Welcome to DOGSBYTE.COM





BLScan © 2004
BLScan is a utility for management of a Mailwasher blacklist file.

MailWasher, a product of Firetrust, is a commercial package that offers multiple strategies for the effective control of spam.




For several years I have been using Firetrust's excellent Mailwasher program to filter my incoming emails. This has proven to be very successful, with close to 100% of my email being recognised correctly as legitimate or otherwise. One of the anti-spam strategies employed by that product is a 'blacklist' function. Over time, however, the associated blacklist file can get quite large and unwieldy. I wrote BLScan to help manage this file.

This page documents the BLScan utility. Unless otherwise specified, the description applies whether you are using the Classic or Pro version of BLScanPlus. The Pro version of BLScanPlus has a number of additional features that are documented here.




BLScan


BLScan is a utility that analyzes a Mailwasher Blacklist file. Mailwasher stores blacklisted email addresses in a file called Blacklist.txt. This file is readable and is stored in the Mailwasher "Application Data" folder. The path to this folder can be obtained from the Mailwasher Help->About dialog box (it is the link at the bottom of that dialog box).

The purpose of BLScan is to hunt down frequently occurring domains and to replace them with wildcard statements. This not only reduces the size of the blacklist file, but also prevents any more unwanted email from those domains making it to your browser window (depending on your Mailwasher settings). My blacklist is now my strongest spam defence; example screen shots from my Mailwasher statistics show how effective it has become.

BLScan has two modes:
  • The first is 'scan' mode, where the blacklist file is examined and a number of reports are generated describing the blacklist content. This mode also generates lists of wildcard statements that can be inserted into the source blacklist file by manual editing.
  • The second is 'manual wildcarding', which automates this process by stepping through the blacklist file. When a given domain occurs at least a certain number of times (the threshold), the user can choose to leave it as is, replace all occurrences with a single wildcard statement, or 'ignore' it so that in future it is not reported. Ignoring a domain is useful for friendly domains that you know are not spam sources. Manual wildcarding produces a new Blacklist file that replaces your existing Blacklist file.
So remember that:
  • Scan mode generates summaries and is useful for establishing what scope exists to optimize the file.
  • Manual wildcarding sets this optimization in motion and generates a new blacklist file that can be used to replace your current blacklist file.
If you edit the blacklist file manually, never copy, edit or rename any files in the Mailwasher Application Data folder while Mailwasher is running, because they will be overwritten when you exit Mailwasher.
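The core idea behind scan mode, counting blacklist entries per domain and proposing a wildcard for each, can be sketched roughly as follows. This is only an illustration in Python, not BLScan's own code: it assumes the entries have already been read into a list of plain email addresses (the real Blacklist.txt layout is not described here), and the function names are made up for the example.

```python
from collections import Counter


def domain_of(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].strip().lower()


def count_domains(addresses):
    """Count how many blacklist entries share each domain."""
    return Counter(domain_of(a) for a in addresses if "@" in a)


def wildcard_for(domain: str) -> str:
    """Build a wildcard statement covering every user at a domain."""
    return f"*@{domain}"


# Hypothetical sample entries, not taken from a real blacklist.
entries = [
    "offers@spam.com",
    "deals@spam.com",
    "win@spam.com",
    "bob@example.org",
]

counts = count_domains(entries)
for domain, n in counts.most_common():
    # One line per domain: occurrence count, domain, candidate wildcard.
    print(f"{n:3d}  {domain}  ->  {wildcard_for(domain)}")
```

A single `*@spam.com` line would then stand in for the three separate spam.com entries.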




BLScan Functions

Use BLScan to reduce the size of your blacklist, block even more spam and improve speed.



Both Scan and Manual Wildcarding require the blacklist file for input. You can either copy it into the BLScanPlus program folder, or simply provide the full path to the blacklist file in the Mailwasher folder. This path can be 'hardwired' using the Settings menu/tab.


Scan;

This scans the blacklist and primarily reports how many blacklist entries (email addresses) are from the same domain. The Scan function also generates wildcards and domain reports. The output is written to up to five output files (two of them are optional):






BLScanT (Totals)
This file carries a list of all the domains found and how often they occurred. If any user names appear multiple times, they are also reported in this file. Wildcarding user names is not recommended, since this may blacklist legitimate email. The summary list is still generated, however, and the appropriate wildcards are also reported in the BLScanM file.

The BLScanT file also carries two additional reports. Firstly, a report of 'embedded' domains (such as msn.spam.com, yahoo.junk.com, trash.hotmail.com) and secondly a 'word incidence' report which will show similar domains (such as a.spammer.com, msn.spammer_123.com, bigspammer.com). By examining these reports, patterns can often become evident in spam source addresses/domains and new wildcards can easily be identified to combat them.
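To give a feel for what a 'word incidence' report might look like, here is a small Python sketch. How BLScan actually computes its report is not documented on this page, so the approach below (split each domain into alphabetic words, then count how many domains contain each word as a substring) is purely an assumption, as are the function names and the minimum word length.

```python
import re
from collections import Counter


def words_in(domain):
    """Split a domain into alphabetic words, ignoring dots, digits and underscores."""
    return [w for w in re.split(r"[^a-z]+", domain.lower()) if w]


def word_incidence(domains, min_length=4):
    """Count, for each candidate word, how many domains contain it as a substring."""
    candidates = {
        w for d in domains for w in words_in(d) if len(w) >= min_length
    }
    return Counter(
        {w: sum(w in d.lower() for d in domains) for w in candidates}
    )


# The example domains used in the description above.
domains = ["a.spammer.com", "msn.spammer_123.com", "bigspammer.com"]

for word, n in word_incidence(domains).most_common():
    print(f"{word}: appears in {n} domain(s)")
```

With these three domains, the word "spammer" turns up in all of them, which is the kind of pattern that suggests a new wildcard.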






BLScanM (Multiples)
This file carries a list of the wildcard statements that you could cut and paste directly into your blacklist file to blacklist all of the domains that occurred multiple times. In practice, you would not use all of the wildcards in this file, since they would almost certainly include many popular domains used by your friends. You should only cut and paste the wildcards that will screen out pest domains.





BLScanA (All)
This is basically the same information as the BLScanM file, but this file carries a list of the wildcard statements for EVERY domain that was present in your blacklist file. Again, in practice, you would not use all of the wildcards in this file, since you would end up blacklisting entire domains such as hotmail.com, and any email from hotmail.com would then be blacklisted. You should only cut and paste the wildcards that will screen out pest domains.




BLScanS (Sort)
This file is optional and lists all the blacklist entries sorted by user name, by domain and by date. The sort is actually a two column sort: three consecutive lists are generated, with a precedence of username/domain, domain/username and date/domain. This option is only available in scan mode, but if you generate a new blacklist file using BLScan and want to see the entries sorted, you can run BLScan in scan mode and enter the new blacklist file name.
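The three two-column sorts can be illustrated with a short Python sketch. It assumes each entry has already been parsed into a (user name, domain, date) tuple; the sample entries are invented, and the dd/mm/yyyy output follows the UK date format mentioned in the description of the Sort option.

```python
from datetime import date
from operator import itemgetter

# Hypothetical parsed entries: (user name, domain, date).
entries = [
    ("offers", "spam.com", date(2004, 3, 1)),
    ("deals", "junk.com", date(2004, 1, 15)),
    ("deals", "spam.com", date(2004, 2, 9)),
]

# Each list sorts on a primary column, then a secondary one.
by_user_domain = sorted(entries, key=itemgetter(0, 1))
by_domain_user = sorted(entries, key=itemgetter(1, 0))
by_date_domain = sorted(entries, key=itemgetter(2, 1))

for title, rows in [
    ("username/domain", by_user_domain),
    ("domain/username", by_domain_user),
    ("date/domain", by_date_domain),
]:
    print(title)
    for user, domain, added in rows:
        print(f"  {user}@{domain}  {added:%d/%m/%Y}")
```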






BLScanX (eXchange)
The Scan function will also optionally generate an eXchange file, intended for sharing wildcards with other users. The file contains a list of all of your wildcard expressions, less any country specific wildcards (those that have the two letter country code appended). A utility called BLComp allows Blacklist files or exchange files to be compared.
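As a rough illustration of how such an exchange list could be built, the Python sketch below keeps only wildcard entries and drops those ending in a two letter country code. The test for 'country specific' (a final label of exactly two letters) is an assumption about what BLScan does, and the function names are invented for the example.

```python
def is_wildcard(entry: str) -> bool:
    """Treat any entry containing '*' as a wildcard statement."""
    return "*" in entry


def is_country_specific(entry: str) -> bool:
    """Assume a two letter final label (e.g. '.uk') marks a country code."""
    last_label = entry.rsplit(".", 1)[-1]
    return len(last_label) == 2 and last_label.isalpha()


def exchange_list(entries):
    """Keep wildcards only, leaving out the country specific ones."""
    return [
        e for e in entries
        if is_wildcard(e) and not is_country_specific(e)
    ]


# Hypothetical blacklist entries.
entries = [
    "*@spam.com",
    "*@junk.co.uk",
    "bob@example.org",
    "*_sale@offers.net",
]

print(exchange_list(entries))
```

Here the plain address and the .uk wildcard are left out, so only `*@spam.com` and `*_sale@offers.net` would be shared.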





Manual Wildcarding;

Manually editing the blacklist file by cutting and pasting from the results of a scan is certainly one option for cleaning up a blacklist and dealing with multiples. But BLScan also has a more efficient method for doing this, called 'manual wildcarding'. The manual wildcarding function scans the blacklist as before, but prompts you as to whether you want to wildcard a particular domain. If you do, this mode deletes all occurrences of that domain, inserts a wildcard and moves on to the next. When you're done, BLScan creates a new blacklist file that you can substitute for your current blacklist file. This action generates one output file:





BLScanB (Blacklist)
A new blacklist file with blacklist entries removed (for the domains you specified) and replaced by wildcard statements. This new blacklist file can be used to replace your existing blacklist file. This step can be done manually by the user, or set to be done automatically using the "Settings" options. To do this manually, rename this file to Blacklist(.txt) and copy it into the Mailwasher Application Data area (either overwrite the old file or rename the old file to something else). If you get into trouble doing this, a backup copy of your original blacklist file is available in the BLScanBackup folder within the BLScan installation folder.

An additional option, the 'ignore list', is also available when manual wildcarding. When you are prompted as to whether you want to blacklist the entire domain, a third option is to 'ignore' the domain. This writes the domain name into an 'ignore list'. The next time you do manual wildcarding, you can use the ignore list and BLScan will not prompt you if any of these domains are encountered. The purpose of this is to protect your favourite domains against accidental wildcarding. Many domains such as msn.com should probably never be wildcarded, so they are good candidates for the ignore list. The ignore list can be reset and also edited at any time.
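The manual wildcarding loop, including the ignore list, can be sketched in Python as below. This is not BLScan's code: the interactive prompt is replaced by a `decide` callback so the example runs unattended, entries are assumed to be plain email addresses, and all names are hypothetical.

```python
from collections import Counter


def domain_of(address):
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower()


def manual_wildcard(addresses, threshold, decide, ignore_list=()):
    """Step through frequent domains; decide() returns
    'wildcard', 'leave' or 'ignore' for each one."""
    ignored = set(ignore_list)
    counts = Counter(domain_of(a) for a in addresses)
    wildcarded = set()
    for domain, n in counts.items():
        # Skip rare domains and anything already on the ignore list.
        if n < threshold or domain in ignored:
            continue
        choice = decide(domain, n)
        if choice == "wildcard":
            wildcarded.add(domain)
        elif choice == "ignore":
            ignored.add(domain)
    # Drop every entry for a wildcarded domain, then add one wildcard each.
    kept = [a for a in addresses if domain_of(a) not in wildcarded]
    new_blacklist = kept + [f"*@{d}" for d in sorted(wildcarded)]
    return new_blacklist, sorted(ignored)


addresses = ["a@spam.com", "b@spam.com", "c@msn.com", "d@msn.com", "e@example.org"]
# Stand-in for the user's answers: wildcard spam.com, ignore everything else.
answers = lambda domain, n: "wildcard" if domain == "spam.com" else "ignore"

new_blacklist, ignore_list = manual_wildcard(addresses, 2, answers)
print(new_blacklist)
print(ignore_list)
```

Passing the returned ignore list back in on the next run means msn.com is never offered for wildcarding again.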











BLScan Options

Manual wildcarding and Scan options;

File name
If no file has been specified using the Settings menu, BLScan asks for a blacklist file name; it does not assume the file will be called Blacklist.txt. This means that you can use BLScan in a separate folder, check it out and mess around with the settings without disturbing your current blacklist file. (BLScan will also automatically create a backup of any file it reads in.) The input file must be a valid blacklist (i.e. correct file format); the ordering of the blacklist and whitelist is unimportant.







Threshold
A threshold number. Domains will only show up if they occur at least this many times (so with a threshold of 5, domains that occur in your blacklist file fewer than 5 times will not be reported). The reporting of user names that appear multiple times is unaffected by the threshold setting, because there are generally so few that a threshold >2 would probably not show any at all.






Time stamp
Timestamp output files. This adds a timestamp onto the name of the output files, so you can tell which are old/new. If this option is not selected then BLScan will overwrite any previously generated output files. Since one file is produced during manual wildcarding and three to five during scan, this timestamp helps to keep track of the order in which files were generated. Selecting View->Details in your file browser can also show the date/time order. (For BLScanPlus "Pro", timestamping is set on the Settings tab).
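The effect of the timestamp option on file naming can be pictured with a small Python sketch. The exact stamp format BLScan appends and the `.txt` extension are not stated on this page, so both are assumptions made for illustration only.

```python
from datetime import datetime


def output_name(base, timestamp=False, now=None):
    """Build an output file name, optionally with a timestamp suffix.

    The '_YYYYMMDD_HHMMSS' suffix and '.txt' extension are assumed,
    not taken from BLScan itself.
    """
    if not timestamp:
        # Same name every run, so an older file would be overwritten.
        return f"{base}.txt"
    now = now or datetime.now()
    return f"{base}_{now:%Y%m%d_%H%M%S}.txt"


print(output_name("BLScanT"))
print(output_name("BLScanT", timestamp=True, now=datetime(2004, 5, 1, 9, 30, 0)))
```

Because the stamp runs from year down to second, sorting the files by name also sorts them in the order they were generated.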






Date code
Blacklist entries have an associated date code. This option allows new entries to be assigned the current date or a value of zero. Mailwasher expires underused blacklist entries after a certain number of days (set in Mailwasher under "Spam Tools->Blacklist->List options"). If you use the current date, the countdown to expiry begins immediately. If you use zero, the countdown will not begin until the blacklist entry has been used at least once.

If you use a date code of zero and the domain never re-occurs, that wildcard will remain in the blacklist indefinitely.
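The difference between the two date code choices can be expressed as a short Python sketch of the expiry rule as described above. It is only a model of the described behaviour, not Mailwasher's implementation: a zero date code is represented here by `None`, and the function name and parameters are invented.

```python
from datetime import date, timedelta


def is_expired(date_code, today, expiry_days):
    """Return True if an entry's countdown has run out.

    date_code is the date the countdown started, or None for a
    zero date code (countdown not started yet).
    """
    if date_code is None:
        # Never used since it was added, so it cannot expire.
        return False
    return today - date_code > timedelta(days=expiry_days)


today = date(2004, 6, 1)
print(is_expired(None, today, 30))               # zero date code
print(is_expired(date(2004, 1, 1), today, 30))   # countdown started long ago
print(is_expired(date(2004, 5, 20), today, 30))  # countdown still running
```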






Manual/Scan
Select manual wildcarding or scan. As described, manual wildcarding allows you to step through the blacklist and choose whether to wildcard any domains that occur at least a certain number of times (set by the threshold). Scan analyzes the blacklist and reports potential wildcards and summary information. It is worth running scan to begin with, so that you can see what can be achieved by introducing appropriate wildcards.






Extra options for Scan;


Sort
Option to generate a file called BLScanS, with the blacklist entries sorted by user name, by domain and by date (as per the description of the BLScanS file above). The sort is actually a two column sort: three consecutive lists are produced, with a precedence of username/domain, domain/username and date/domain. The date sort includes the calendar date in the UK format of dd/mm/yyyy.






Exchange
Option to generate a file called BLScanX. This file contains all of your Blacklist wildcard entries less any country specific domains (which have the two letter country code appended). This file can be exchanged with other users to share wildcards. A comparison utility called BLComp allows Blacklist files or exchange files to be compared. BLComp can in fact compare any list of email addresses/wildcards with any other form of list.






Extra options for Manual Wildcarding;

Retain
Retain redundant wildcards. If you already have some wildcards in your blacklist, these may become redundant. For example, if you already have a wildcard like *_sale@spam.com and BLScan generates a new wildcard such as *@spam.com, the original wildcard is redundant. This option allows you to either get rid of any existing wildcards that become redundant or retain them.
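One simple way to spot redundant entries is to test each existing entry against the new wildcard using shell-style pattern matching. The Python sketch below uses the standard `fnmatch` module for this. It is an assumption about how redundancy could be detected, not a description of BLScan's internals, and the `prune` function and its `retain` flag are hypothetical names.

```python
from fnmatch import fnmatchcase


def prune(entries, new_wildcard, retain=False):
    """Add new_wildcard, dropping entries it already covers unless retain is set."""
    pattern = new_wildcard.lower()
    kept = [
        e for e in entries
        # Never keep a second copy of the new wildcard itself.
        if e != new_wildcard
        # An existing entry is redundant if the new wildcard matches it.
        and (retain or not fnmatchcase(e.lower(), pattern))
    ]
    return kept + [new_wildcard]


entries = ["*_sale@spam.com", "bob@example.org"]

print(prune(entries, "*@spam.com"))
print(prune(entries, "*@spam.com", retain=True))
```

With the option off, `*_sale@spam.com` disappears because `*@spam.com` already covers it; with the option on, both wildcards stay in the list.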





View
View domain list. This option controls the verbosity of BLScan.

For the "Classic" version, this option will show all relevant blacklist entries during the manual wildcarding process.

For the "Pro" version, this option controls whether the same entries mentioned above are written into the log file.






Equivalent options are available in both the "Classic" and "Pro" versions of BLScanPlus (although their appearance may differ). In either version, the options used during a BLScan session can be stored for use next time around.




BLScan Current Version & Futures
BLScan is included with BLScanPlus.

The version history is given below:
  • 1.0
    • First release.
  • 1.1
    • Fixed time/date compatibility (US versus UK time format).
  • 1.2
    • Modified sort routine to automatically delete redundant blacklist entries if a user name is already fully wildcarded (instead of prompting the user).
  • 1.3
    • Added sorted output option, which generates a new file (BLScanS) with the blacklist sorted by user name, domain and by date.
    • Added report of duplicated user names and appropriate wildcards; these are reported in the totals file (BLScanT).
    • Various code improvements to speed up parts of the program, improve maintenance and tidy up some of the output formatting.
  • 1.4
    • Added 'embedded' domain report to BLScanT file.
    • Added 'word incidence' report to BLScanT file.
    • Settings used during a BLScan session are now stored on exit for use next time around.
    • BLScan will not report duplicated user names if already wildcarded.


  • 1.5
    • Added 'Ignore list' to allow chosen domains to be ignored during Manual Wildcarding (they can be reset/edited).
    • Added 'Exchange' file option to generate a list of blacklist wildcards for exchange with other users.


  • Integrated into both BLScanPlus Classic 2.0 and BLScanPlus Pro 3.0
    • Added option to read the blacklist file and also to output the new blacklist file directly from/to the Mailwasher folder.
    • Added function to remember the path and file name of the blacklist file used in the session.

