home old db
documentation:zdkimfilter zdkimsign redact zaggregate zfilter_db zdkimfilter.conf zdkimgenkey




zfilter_db - zdkimfilter's database test tool  


zfilter_db [option [option-arg... ]...  


This man page describes the database configuration options of zdkimfilter, and then briefly documents the corresponding debug tool.

zfilter_db connects to the database so as to exercise some of the database configuration options and queries, as if a message was received and filtered. Details of the simulated message are set via COMMAND LINE OPTIONS (below).

The database connection is done via OpenDBX. See www.linuxnetworks.de/doc/index.php/OpenDBX for details about specific backends. The design of zdkimfilter database interface provides for a set of queries configured in zdkimfilter.conf and set of variables hard coded in the executables. A reference example is provided.

Although a database is not required for basic sign and verify operations, it is needed for DMARC.  


Database entries in zdkimfilter.conf start with "db_". There are three groups of them: (1) Connection, (2) options proper, and (3) queries. The first group controls the connection itself:
This is one of the DB engines supported by OpenDBX, which include "firebird", "mssql", "mysql", "oracle", "pgsql", "sqlite", "sqlite3", and "sybase".

The name "test" is recognized internally and not passed to OpenDBX. In that case, instead of running a query, the database functions print out a dump of the allowed variables and their values. Such dump is appended to file database_dump in the current directory if the function is linked within zdkimfilter (used in the testsuite), otherwise it goes to stdout as shown in EXAMPLES.

Depending on database type and configuration, this can be a host name, an IP address, a path to a Unix socket, and more. It is a string passed to odbx_init as is.
db_user, db_password, db_database
These parameters are passed to odbx_bind verbatim.
This is a number of seconds. Not all backends use it. Defaults to 2.
Can be "Y" or "N". If enabled, every SQL statement is logged at debug level.

When not enabled, only failed queries are logged.

The second group are the options proper. These have names starting with "db_opt_" and are passed to odbx_set_option, if given, before binding.

Can be "A" for always, "N" for never, or "T" for try.
Can be "Y" or "N". As with all boolean variables, "Y" is assumed if the option is given. However, if not given, no call is made to ODBX about it.

Note that the reference example has a query with multiple statements (db_sql_insert_message), so this option is required if staying with it.

Can be "Y" or "N". See above for defaults.
Mode of operation. This is a string that ODBX passes to the backend.
Number of result records to transfer; 0 means all.

The third groups, queries, are made of items whose names start with "db_sql_". Some queries may or should return values; this doc does not distinguish between queries and statements.

Each query can use a number of variables whose values are collected while processing a mail message or while preparing a report. A variable is identified by a string of the form $(variable-name). Each variable identifier is replaced by the current value, irrespective of any quotation. Missing variables result in the empty string.

Queries in odbx_example.conf exemplify most of the descriptions given below, using stored procedures defined in odbx_example.sql.

Queries, in turn, can be subdivided into four subgroups: (1) per-domain settings, storage of (2) incoming and (3) outgoing messages, and (4) reports. These subgroups characterize when the relevant queries are run. In the rest of this section there is a subsection for each subgroup, but traits and variables common to (2) and (3) are collected in their own subsections.  

Per-domain settings

These settings are queried before signature verification of incoming messages. The first query, new in version 1.5, extends the second one.
Three variables are allowed:
A domain name related to an incoming message, for example a DKIM signer.
The IP number of the client. For IPv4 this is given in the classic dotted quad format.

The INET6 data type used in the example requires a "::ffff:" prefix.

The reverse DNS name of the client. This variable is available if a reverse DNS PTR RRset is defined and if looking it up by Courier is enabled (by setting the "BOFHCHECKDNS" option in the esmtp configuration file).

The query runs for the From: domain, for the original From: domain if From: was apparently munged, their organizational domains, for any domain mentioned in SPF or DKIM results, for the sender domain and for the list domain.

It can return one to four integer values, in this order:

This is the internal whitelisting, not to be conflated with dnswl.

A negative value, -1, triggers shoot_on_sight mode. The message is rejected ``for policy reasons'', whether the domain is authenticated or not, unless overridden by another whitelisting (see whitelisted_pass below) or by dnswl (greater or equal to dnswl_worthiness_pass).

A value of 0 is ignored.

A value of 1 or higher affects the order in which DKIM signatures verification is attempted and reported. This value can be increased automatically to 1 when a DKIM signature of the domain verifies successfully, to 2 when sending to a domain from a personal address (not an rfc2142 address.)

When this value is greater or equal to the whitelisted_pass configured value, it prevents a message from being rejected or dropped due to DMARC or ADSP policy, shoot_on_sight or action_header. The domain needs to be authenticated and not flagged shoot_on_sight itself.

Add to the global honor_dmarc setting. If DMARC is globally honored, a value of -1 disables it for this domain. A value of 1 enables it for this message, even if DMARC is not globally enabled.

Unless domain is aligned with org_domain, this result is discarded.

Add to the global honor_author_domain setting. Like the previous item, this is added (with sign) to the global value.

Unless domain matches the From: address, this result is discarded.

This can be set to COUNT(*) in order to learn how many records there were. Currently not used.
This query is kept for compatibility, and is only used if the previous one is not defined. It can use the same three variables:
A domain name.
The IP number of the client.
The reverse DNS name of the client. This variable is available if a reverse DNS PTR RRset is defined and if looking it up by Courier is enabled (by setting the "BOFHCHECKDNS" option in the esmtp configuration file).

This query should return a single integer, the whitelist value.


Traits and variables common to incoming and outgoing messages

After filtering is complete and response to Courier given, zdkimfilter stores relevant message data to the database. Each message is either incoming or outgoing, but the following thirteen variables are available in either case:
ino, mtime, and pid
These are the values used by Courier to identify a message, the inode, the date and the process-id. These variables are given as hex string, but without the leading "0x". The following MySQL expression reconstructs the id value found on the last line of Courier's Received. The expression uses field names, rather than variables, since ino, mtime, and pid are also key field names in the message tables given as example.

   CONCAT_WS('.', LPAD(HEX(ino), 16, '0'), LPAD(HEX(mtime), 16, '0'), LPAD(HEX(pid), 8, '0'))

from, date, message_id, subject, content_type, and content_encoding
These are some of the header fields in the message. Values are trimmed, Content-Type is truncated at the semicolon if any. Case is preserved, since database systems are case insensitive.
This flag is 1 if the message has some header fields that are typical of mailing lists. Some "List-*" fields, "Precedence:list", and "Mailing-List:".
The bounce address, also known as envelop from or mailfrom.
The IP number of the client. For IPv4 it is given in the classic dotted quad format; that is, without "::ffff:".
The reverse DNS name of the client. This variable is available if a reverse DNS PTR RRset is defined and if looking it up by Courier is enabled (by setting the "BOFHCHECKDNS" option in the esmtp configuration file).

Storing incoming messages synopsis

Relevant incoming message data can be stored in the database by defining some or all of the queries below. Besides common variables, the queries in this group can use two additional sets of variables, one for the message proper, and one for each domain related to that message.

incoming message variables

This is one of the three values: "accept", "reject", and "drop", which represent the message disposition as far as zdkimfilter knows.
DMARC view of DKIM validation of the whole message, it is one of:
if a signature by an aligned domain passed.
if no aligned signature passed.
if no aligned signature or no DMARC record were found.
DMARC view of SPF authentication, it is one of:
if a domain is authenticated with either BOFHSPFHELO or BOFHSPFMAILFROM and it is aligned in the way (relaxed or strict) specified by the DMARC record of the author's domain.
if SPF authentication did not pass, or the authenticated domain was not aligned as required.
if no DMARC record was found.
The message disposition, as far as DMARC is concerned. It is not necessarily consistent with message_status. The value is one of:
No policy impact on message disposition.
The message is treated ``as suspicious'', to use RFC 7489 words. (All what zdkimfilter does is to write ``(QUARANTINE)'' in the Authentication-Results. Any enforcement is passed on to mail delivery.)
The message is rejected because of remote DMARC policy.
Currently, zdkimfilter sets this value as follows:
dmarc_dispo is in agreement with the remote DMARC policy.
Not used.
Remote DMARC policy applies to a percentage only, and the luck of the draw exempts this message.
An authenticated domain is whitelisted >= whitelisted_pass or is in DNSWL >= dnswl_worthiness_pass.
MLM transformation applied and at least one signature was recovered that way.
None if DMARC, ADSP, or reject_on_nxdomain applies. Remote DMARC policy is not being honored for this message.
DMARC policy failed, but BOFHSPFFROM passed. Note that SPF protocol does not provide a scope for "From:"; however, it may happen that a message has no signature, an empty envelope sender, a mismatched BOFHSPFHELO, but a good BOFHSPFFROM.
A comma separated list of tokens. One of the three possible values of the policy, "unknown", "all", and "discardable"; along with "found", in case the (unknown) status was not inferred, and possibly "fail".

This variable is only set if an ADSP record was found. Since v1.5, ADSP records are not always looked up.

The number of "Received:" header fields.
The number of DKIM signatures.

incoming domain variables

An incoming message can be related to a number of domains. The following set of variables are specific to each of them:

The domain name.
Sometimes called auth_type, it is the relationship(s) between message and domain, as claimed by the use of a domain name in a message. In case publicsuffix and save_from_anyway are both disabled --not a recommended setting--- only authenticated relationships are flagged.

Relationship flags are expressed as a comma separated list of the following tokens:

The author domain is the one appearing in the From: field.
"spf_helo", "spf", and "dkim"
The corresponding authentication methods, whose results are given in separate variables. SPF results originate from Courier's BOFHSPFHELO and BOFHSPFMAILFROM.
The domain is the organizational domain of the author domain, retrieved using the Public Suffix List.
The domain where the DMARC record was retrieved from.
The domain is aligned with with the author domain, in the sense that they share the same organizational domain.
The domain is whitelisted in the configured zone. This result originates from the -allow option of couriertcpd. See couriertcpd(8).
The domain, either the author domain or the organizational domain thereof, does not actually exist because NXDOMAIN was returned on querying its Name Servers.
A rewritten digest of the DMARC record, used for aggregate reports. Targets and interval are moved to their own variables, so this digest shouldn't take more than 60 bytes, including the appended sentinel (see "dmarc_rua" below).
A multiple of original_ri that matches the honored_report_interval.
The Report Interval requested by the domain.
The number of bytes that must be removed from the domain name in order to obtain the organizational domain name, as resulting from the Public Suffix List.
A rewritten subset of aggregate report targets. Only "mailto:" targets are considered, and stored as a list of mail addresses.

A sentinel is appended to the list, to know if excessive length caused truncation. In that case, the DMARC record is looked up anew. Even if DB queries are usually faster than DNS ones, reserving too much room might degrade performance, so a compromise length that fits most domains could be preferred.

Sometimes called spf_result, it is the SPF result, if the domain appears in any of helo, mailfrom, or From:. SPF results originate from Courier's BOFHSPFHELO and BOFHSPFMAILFROM.
This is set to 1 if any DKIM signature from the given domain verified. 0 otherwise. Note that a variable with the same name but different content is available for db_sql_insert_msg_ref.

incoming message queries

These queries run right after a zdkimfilter process yields control of the message back to Courier. They run in the order given below:

This query can use incoming message variables only, not the domain ones. It can return a message_ref, a single value that will be available to subsequent queries. Since Courier's triplet ino, mtime, and pid can be used as a unique key to the message table, it is not imperative that the key of the last inserted message be returned. However, the example uses a second statement to select the last inserted ID as a return value.
This query can be used to check whether a domain name exists already. It can return the key --one row, one column--- which will be made available as variable domain_ref.

The reference example maps this query to a stored procedure which takes care to insert the domain if it isn't found.

This query runs only if the previous one did not return the key. It can return the key too. If it does not, db_sql_select_domain runs for a second time right after this.

Be wary of double insertions, as multiple filters can be active at the same time. The example does not use this query.

This query runs only if the first invocation of db_sql_select_domain did return a domain_ref. That is meant to update the domain record which existed already.

The example does not use this query, since the stored procedure used to select or insert does the necessary updates as well.

This is the last query in the domain loop. It is used to insert msg_ref records. If a message was signed multiple times by a given domain, this query is executed multiple times, one for each signature. If a message has no signatures, the query is still executed once, to link the domain to the message. It has the following additional variables:
Sometimes called dkim_result, it is the verification result proper, one of none, pass, fail, policy, neutral, temperror, permerror. Note that a variable with the same name but different content is available in the domain calls.
A tiny integer that is non-zero if a signature was recovered after undoing a mailing list manager transformation.

Aggregate reports can output a human_result like "through MLM transformation" when this datum is present for a signature.

The 1-based position of this domain's signature(s), preferred domains first.

Ordering is done before verifying signatures, and it is reported to domains which accept DMARC aggregate reports.

The signature selector, "s=" tag.
The value returned by db_sql_insert_message
The value returned by either db_sql_select_domain or db_sql_insert_domain.

Storing outgoing messages synopsis


db_sql_check_user in action, as displayed by Thunderbird (above) and Courier webmail (below) clients.

Limiting can be achieved by imposing a fixed limit, for example like so:
db_sql_check_user SELECT IF ((SELECT SUM(rcpt_count)\ FROM message_out WHERE user = $(user_ref) AND\ mtime > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)) > 500,\ 'Block this guy', 0) AS reason

More complex settings, possibly involving user interactions with web forms are possible.

These queries consist of an initial query, a loop for each target domain, and a final query. They can use the common variables defined above, and three additional ones.

outgoing message variables

The number of envelope recipients that the message is to be sent to.
This is either the user domain, in db_sql_select_user and db_sql_check_user, or one of the domains where the message is destined (target domains).
A flag that likely identifies the message as a complaint:
bit 0 (complaint_flag & 1)
The envelope sender is empty.
bit 1 (complaint_flag & 2)
The From: field is a common address; that is, one of ``postmaster'', ``abuse'', ``info'', ``sales'', ``list'', ``support'', ``noc'', ``marketing'', ``security'', ``hostmaster'', ``webmaster''.

outgoing message queries

This can additionally use local_part and domain, which identify the authenticated user who sent the message. It is the first query called for each message. It can return two values, that is one row with two columns, that will be available as variables user_ref and message_ref to the remaining queries.
This query selects a target domain and returns a domain_ref. This and the following queries occur once per target domain; that is, for each of the domains that appear in one or more of the message's recipients.
This query is called only if no domain_ref was selected by db_sql_select_target. It is meant to insert a domain and possibly returns domain_ref directly. If it returns nothing, then db_sql_select_target is called a second time with the same domain.
This query is called only if domain_ref was obtained by the first call to db_sql_select_target; in that case, db_sql_insert_target is not called.
This completes the target loop. It is called with both domain_ref and message_ref and provides the many-to-many link between the targeted domain and the user.
This query is called last for outgoing messages. It can use local_part and domain as well as user_ref variables to identify the user. It is meant to check whether a user's activity is normal or exceedingly high, such as submitting too many messages for too many recipients, from too many IP addresses.

If the query returns anything that is not 0 or NULL, and blocked_user_list is defined, then the filter appends the user-id to that list, quoting the returned string as a reason.

As the result depends on a query, it can as well consider a field that users can adjust from a web form in order to temporarily increase their limits.


Retrieving values for DMARC aggregate reports

This last group of queries is only used by zaggregate. Two queries are expected to return multiple rows and columns with the data that was stored from incoming messages; columns here are named as the corresponding variables there. A third query updates domain timestamp. The three of them --one per subsection--- are described below, although zfilter_db, as a command, simulates filtering only, not reporting. For the rest of the story see zaggregate(1).

DMARC aggregate domain query

The query db_sql_dmarc_agg_domain selects the domains that deserve an aggregate report. It can use two variables:

The duration, in seconds, of the reporting interval. It is the value configured as honored_report_interval, possibly forced to be a submultiple of 86400 (24 hours).
The end of the period, in seconds since the Unix epoch, UTC. It is the time of the current zaggregate invocation, rounded downward to a multiple of period.

This query is expected to return six columns, the first two of which are essential, in the order given:

Domain key, passed as-is to the next query.
Domain name, as written in the aggregate report. This name is also used to look up the following three "dmarc_r*" values, in case they are missing or corrupted, and to validate external report targets.
The end period of the previous report, 0 if it is the first report to this domain. This value is used to skip a domain if its report is due at a future interval, to avoid sending already sent data, and the like.
The (adjusted) report interval for this domain.
The report targets.
The DMARC record, used to report the policy published by the domain.

DMARC aggregate record query

The query db_sql_dmarc_agg_record selects the records, or rows, to be reported for a given domain. It can use four variables:

Domain key, as obtained from the previous query.
Domain name.
The beginning of the period being reported, in seconds since the Unix epoch, UTC.
The end of the period, as above.

This query is expected to return 11 + N*4 columns, where N is the number of DKIM signatures reported. The order of columns is as given below (parenthesized names refer to XML tags in the report):

Where messages in this row originated from ("source_ip").
Number of messages that this row covers ("count").
Message's overall DKIM evaluation ("dkim").
Message's overall SPF evaluation ("spf").
The policy-override reason detailed above ("reason/type"). The XML schema provides for a comment to supplement the reason ("reason/comment"). To add it, put it after the reason but in the same column, separated by a space. That can be done, for example, using MySQL's CONCAT function.
The author's domain ("header_from"). Of course, it turns out to match the DMARC domain, but doesn't have to be exactly equal.
spf domain 1
The domain of envelope sender in scope mfrom ("envelope_from" and possibly "spf/domain"). Only one SPF domain/result pair is reported. Preferably this one, since it is the most widely used SPF authentication scope.
spf result 1
The corresponding result ("spf/result").
spf domain 2
The domain of HELO/EHLO verb, in scope helo ("spf/domain"). This is the only official SPF alternative to envelope sender. zaggregate reports this second pair of SPF values only if it is a pass while the first one is not, or if the result is similar but this domain is better aligned with From: than the first domain.
spf result 2
The corresponding result ("spf/result").
dkim domain N
The Nth signing domain ("dkim/domain").
dkim selector N
The selector used to sign ("dkim/selector").
dkim result N
The corresponding result ("dkim/result").
dkim transformation N
An integer, if positive signals that the MLM transformation succeeded ("dkim/human_result").

The number N of DKIM pairs may vary from a row to another. If either member is NULL the pair is skipped.

Check recipient address

The query db_sql_check_dmarc_rcpt checks an email address, which is available as a variable. Returns zero on good addresses.

The only variable is the address to be checked.

The email address given in "rua=" (possibly overridden).

Set DMARC aggregate query

The query db_sql_set_dmarc_agg updates the last_report using the value of period_end. For practical reasons, it can use exactly the same four variables as db_sql_dmarc_agg_record, but is not expected to return anything.  


Now for the options to run zfilter_db, to add fake records to the database. Some options expect an option-arg mentioned in the synopsis are command line arguments. Any option-arg cannot start with a minus. They're as follows:
Increase verbosity during query processing. Can be given two or three times to obtain further details.
-f config-filename
Specify the configuration file.
Report the "db_" configuration options only.
Print usage and exit.
Report package version string and exit.
Don't actually run queries, just print them out. OpenDBX is used normally, including escaping arguments, but the odbx_query call is skipped.
Force the "test" backend. OpenDBX is not used at all. The list of allowed variables is printed along with their available values, if any. Using this option is the same as setting db_backend to "test".
--db-sql-whitelisted domain ...
Run the query for each domain. An option --that is, an argument starting with "-"-- marks the end of the domains.
--db-sql-domain-flags domain ...
Run the query for each domain. An option --that is, an argument starting with "-"-- marks the end of the domains.
--set-stats dir [message data|domain[/token]]
Specify message data. The dir argument consists of a single character, either "i" for incoming or "o" for outgoing. The arguments after that are interpreted as message data if option --set-stats-domain is present, or as domain[/token] otherwise.

Up to 16 arguments are accepted for message data:

Either the IP number for incoming messages, or the authenticated user-id for outgoing ones.
received_count for incoming messages, rcpt_count for outgoing ones.
signatures_count for incoming messages, complaint_flag for outgoing ones.
A dot-separated joint of ino, mtime, and pid

Missing arguments are generated at random, so as to insert different records. A "@" can be used as a placeholder for a missing argument not at the end. However, domain names have to be supplied explicitly, otherwise the call fails with a critical error.

--set-stats-domain domain[/token[:value]]... ...
This option must be placed after --set-stats. It marks the end of message data and starts domain arguments. If the option is not given, message data is assumed to be empty and the arguments are interpreted as domains.

Domain names are given one per argument. Each domain name can be followed by a slash-separated list of tokens, some of which can take a value. Spaces must be omitted, escaped, or quoted so as to have one argument per domain. Valid tokens are:

Identifies an author's domain.
Identifies an envelope sender's domain (mfrom). SPF result can be given as value.
Identifies a helo domain. SPF result can be given as value.
Identifies a signing domain. DKIM result can be given as value.
"org", "aligned", and "nx"
Set corresponding flags, no value.
Identifies a dmarc domain. DMARC record can be given as value.

When using the --test or --dry-run options, the calling function won't get any result from queries. Hence, some statements are called multiple times.  


The distributed file odbx_example.conf contains an example configuration which works with MariaDB.

Example database tables and procedures are defined in odbx_example.sql.

For DMARC, file odbx_example.sh contains a few calls to zfilter_db meant to provide some data to test zaggregate.

The following examples work in such environment.

Check what happens receiving a message:

  $ zfilter_db -f test.conf --set-stats i example.com

Same as above, with a DKIM signature:

  $ zfilter_db -f test.conf --set-stats i example.com/dkim

The two commands above only insert a record in message_in. To simulate a real message, one can use something like the following:

  $ zfilter_db -f test.conf \
    --set-stats I @ bounces@s.example.org ugo@s.example.org \
    --set-stats-domain \
      example.org/org/aligned/dmarc:'v=DMARC1; p=reject; ri=300;' \

Besides inserting a message_in record, the latter command inserts or updates the domain and generates as many msg_ref records as needed.

It could result in the following:

  MariaDB [zdkimfilter]> SELECT m.ip, FROM_UNIXTIME(m.mtime) AS time,
      -> d.domain, r.dkim, r.dkim_order AS o 
      -> FROM msg_ref AS r, message_in AS m, domain AS d
      -> WHERE r.domain=d.id AND r.message_in=m.id
      -> ORDER BY m.id DESC limit 3;
  | ip               | time                | domain        | dkim | o |
  | ::ffff: | 2022-05-13 19:31:18 | s.example.org | pass | 1 |
  | ::ffff: | 2022-05-13 19:31:18 | s.example.org | pass | 2 |
  | ::ffff: | 2022-05-13 19:31:18 | example.org   | none | 0 |
  3 rows in set (0.001 sec)

Does the user check function work? To test it, we simulate an outgoing message with 8000 recipients. Since rcpt_count is accepted at position 9, we use "@"s placeholders and let the test program assign those values for us:

  $ zfilter_db -f test.conf --set-stats-domain example.org \
  > --set-stats o user@example.com @ @ @ @ @ @ @  8000

The bottom line of the output should report ``user check: 0'' until the configured limit is reached.

We can populate the test database as needed, in order to design and test our procedures and scripts. But what variables can we use for a given query? Just ask:

  $ zfilter_db -f test.conf --test --set-stats i example.com,dkim

  Variables allowed for statement db_sql_insert_message:
  ino: 9f5ae9e
  mtime: 627f7c93
  pid: 186e
  ip: ::ffff:
  date: Sat, 14 May 2022 11:55:31 +0200
  envelope_sender: bounce.address@example.com
  from: sender@example.com
  subject: Subject of the msg
  message_id: <2307a5eb@example.com>
  content_type: multipart/mixed
  content_encoding: -- not given --
  received_count: 2
  signatures_count: 1
  mailing_list: 1
  adsp_flags: -- not given --
  dmarc_dkim: none
  dmarc_spf: none
  dmarc_reason: sampled_out
  dmarc_dispo: reject
  message_status: accept

(Never mind whether dummy values are given and, in case, their value.)  


Default configuration file.


Alessandro Vesely <vesely@tana.it>  


RFC 6376
DomainKeys Identified Mail (DKIM) signatures.
RFC 7489
Domain-based Message Authentication, Reporting, and Conformance (DMARC).
RFC 6650
Creation and use of email feedback reports: An applicability statement for the Abuse Reporting Format (ARF).
odbx_init(3) and odbx_bind(3)
Consider ODBX documentation.
odbx_example.sql and odbx_example.conf in the tarball
There is an example of how to set up database logging as described here, using MySQL.

Copyright © 2012-2024 Alessandro Vesely