MultiScanner: A Tool to Help Work the Malware Analysis Queue

May 19, 2015
CND Tools: Post by Drew Bonasera

Malware analysis can sometimes seem like an exercise in hurry-up-and-wait, as analysts run samples through a variety of tools and then wait for output before they can compare results. MultiScanner, a MITRE-developed CND tool recently released as open source on GitHub, runs such tools in parallel and produces a single report combining the results from each tool in the toolbox. This lets analysts manage the queue more efficiently and improves the workflow across the entire team.

Malware Results Aggregator

MultiScanner takes one or more samples as input and, depending on the configuration of available tools, analyzes each file with each tool. The results are collated into a human-readable JSON file that is easily consumed by other tools, such as Elasticsearch and custom scripts. (Note: native integration with different storage systems, such as Elasticsearch and MongoDB, is in the works.)
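To make the collation step concrete, here is a minimal sketch of the general pattern: run every tool against every sample in parallel and gather the results into one dictionary keyed by file path. It is not MultiScanner's actual code. The `TOOLS` table, `scan`, and `to_json` are hypothetical names chosen for illustration, and the only "tools" used are the two file hashes, since those need nothing beyond the Python standard library.

```python
import hashlib
import json
from concurrent.futures import ThreadPoolExecutor


def _digest(path, algorithm):
    """Hash a file in chunks so large samples are not read into memory at once."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical tool table: report key -> callable that takes a file path.
TOOLS = {
    "MD5": lambda path: _digest(path, "md5"),
    "SHA256": lambda path: _digest(path, "sha256"),
}


def scan(paths, tools=TOOLS, max_workers=4):
    """Run every tool on every sample in parallel and collate the results."""
    report = {path: {} for path in paths}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One job per (sample, tool) pair, remembered by its future.
        jobs = {
            pool.submit(tool, path): (path, name)
            for path in paths
            for name, tool in tools.items()
        }
        for job, (path, name) in jobs.items():
            report[path][name] = job.result()
    return report


def to_json(report):
    """Serialize the collated report as indented, key-sorted JSON."""
    return json.dumps(report, indent=4, sort_keys=True)
```

Calling `to_json(scan(["sample.pdf"]))` on a file of your own would produce a report shaped like the one shown later in this post, with one top-level key per sample and one entry per tool.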

MultiScanner includes a default set of built-in tools, as well as hooks to external malware analysis tools that may require separate licenses. The default set includes anti-virus (AVG 2014, Kaspersky 15, McAfee, and VirusTotal), file hashing (MD5/SHA256, ssdeep), metadata extraction (Exif, PEFile, TrID, Libmagic, Pdfinfo), detonation chambers (FireEye AX, Cuckoo Sandbox), and custom signature findings (Yara). If an application is not licensed, as may be the case for FireEye or McAfee, that portion of MultiScanner will quietly "fail" and skip those tool runs.
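The quiet-failure behavior can be pictured with a small sketch. The module layout below (a dictionary with a `check` callable and a `scan` callable) is an assumption made for this example rather than a description of MultiScanner's module interface. The idea is simply that a tool which is unavailable, or which raises an error, is left out of the report instead of stopping the run.

```python
def run_modules(path, modules):
    """Run each available module on one sample, skipping any that cannot run.

    `modules` maps a report key to a dict with two callables (hypothetical
    layout): "check" returns True when the tool is usable, and "scan" takes
    a file path and returns that tool's result.
    """
    results = {}
    for name, module in modules.items():
        try:
            if not module["check"]():
                continue  # tool not licensed or not installed: skip quietly
            results[name] = module["scan"](path)
        except Exception:
            # A tool that breaks mid-run is also skipped, so the other
            # results still make it into the report.
            continue
    return results
```

With this approach a missing license shows up as an absent key in the report, so downstream scripts should treat every tool's entry as optional.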

Malware Analysis Framework Setup

As an open source project, MultiScanner must be configured and initialized for your own environment. The core framework is released under the Mozilla Public License, so you are free to modify it to suit your needs. If you develop a new module that is external to the core MultiScanner code, you can release it under a license of your choosing. To generate the default configuration after cloning the repository, just run python MultiScanner.py init.

MultiScanner can be used on a single server running either Windows or Linux. However, some tools only run on one platform or the other. For this reason, we recommend running MultiScanner primarily on a Linux host and setting up a separate Windows host (referred to as the analytic machine in the documentation) to execute some of the remaining modules. Secure Shell (SSH) is used to communicate with the Windows host, and the network share path must be identified in the configuration file. We use FreeSSHd on Windows to accomplish this connection. MultiScanner copies malware samples to the share, then runs the specified analytic tool on the Windows machine, with its resulting output pulled back for final processing and inclusion in the report.
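As a rough illustration of the Linux-to-Windows handoff, the sketch below translates a sample's path on the Linux host into the path the Windows host would see on the share, then builds an `ssh` command line for running a tool there. The function names, mount point, drive letter, and tool name are all hypothetical, the quoting of the remote command is simplified, and nothing is executed. MultiScanner's own configuration keys and connection code may look quite different.

```python
import ntpath
import posixpath


def to_windows_path(linux_path, linux_mount, windows_share):
    """Map a file under the Linux mount point to its path on the Windows share."""
    relative = posixpath.relpath(linux_path, linux_mount)
    if relative == ".." or relative.startswith("../"):
        raise ValueError(f"{linux_path} is not under {linux_mount}")
    # Rebuild the relative part with Windows separators.
    return ntpath.join(windows_share, *relative.split("/"))


def remote_command(user, host, tool, windows_path):
    """Build (but do not run) an ssh argument list for the Windows host."""
    return ["ssh", f"{user}@{host}", f'{tool} "{windows_path}"']
```

For example, `to_windows_path("/mnt/samples/pdf/a.pdf", "/mnt/samples", "Z:\\")` yields `Z:\pdf\a.pdf`, and the list returned by `remote_command` could be handed to `subprocess.run` once the SSH server on the Windows host is set up.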

The majority of modules included in MultiScanner are hooks to external applications. When configuring your version, be sure that anti-virus or similar applications do not interfere with or prevent the MultiScanner tools from accessing the malware samples. Here is a list of modules that are included and their respective configuration options.

Snippet of Results

By way of example, here's the output from a run of MultiScanner on a malicious PDF file, showing the JSON-formatted results from AVG 2014, ExifTool, Kaspersky AV, MD5, SHA256, and TrID. While the example shows the output from only a few tools, an analyst will quickly realize that the PDF contains malicious JavaScript.

{
    "/malware/pdf/VirusShare_03b0d0ae2cc1c90e08f6c08ee65ba5e5.pdf": {

        "AVG2014": "Script/PDF.Exploit.CU",
        "ExifTool": {
            "FileSize": "84kB",
            "FileType": "PDF",
            "Linearized": "No",
            "MIMEType": "application/pdf",
            "PDFVersion": "1.6",
            "Warning": "Invalid xref table"
        },
        "Kaspersky": "Exploit.JS.Pdfka.fkc",
        "MD5": "03b0d0ae2cc1c90e08f6c08ee65ba5e5",
        "SHA256": "8091de66e9b9ac1949f49a8cc2886c1ab691a157f3b49bd1cb9c8e11624755c4",
        "TrID": [
            [
                "100.0%",
                "Adobe Portable Document Format",
                ".PDF"
            ]
        ]
    }
}     
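A report like the one above is easy to triage with a few lines of script. The sketch below assumes the report has already been loaded with `json.load` and pulls out the anti-virus verdicts for each sample. The `AV_MODULES` tuple is an assumption based on the keys visible in the sample output, so adjust it to match the modules enabled in your own configuration.

```python
# Report keys treated as anti-virus verdicts (assumed from the sample output).
AV_MODULES = ("AVG2014", "Kaspersky", "McAfee")


def detections(report, av_modules=AV_MODULES):
    """Return {sample path: {engine: verdict}} for samples with any AV hit."""
    hits = {}
    for path, results in report.items():
        found = {
            name: results[name]
            for name in av_modules
            if results.get(name)  # skip engines that are absent or empty
        }
        if found:
            hits[path] = found
    return hits
```

Run against the sample report, this would flag the PDF with both the AVG and Kaspersky verdicts, while a file with no anti-virus entries would be left out of the result entirely.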

Improving the Malware Analysis Workflow for Threat-Based Defense

One motivation for writing MultiScanner was to create a hands-off, parallel analysis environment that could handle one or more files of interest, so that malware analysts could better anticipate work "in the queue." The entire team can manage time and workload more efficiently knowing that results from multiple tools will be waiting when an analyst is ready for an in-depth review.

The structured report produced by MultiScanner can simplify results correlation when used in conjunction with searchable tools such as Elasticsearch. The structured report also makes timely, actionable signatures for defense possible. Any tool or analysis framework that consumes JSON could use the malware metadata produced by MultiScanner to streamline the triage process and improve the workflow.
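As one example of feeding the report to a search tool, the sketch below reshapes a loaded report into the newline-delimited JSON that the Elasticsearch bulk API accepts: an action line followed by a document line for each sample. This is a hand-rolled illustration and not the native integration mentioned earlier. The index name `multiscanner` and the choice of the SHA256 hash as the document ID are assumptions.

```python
import json


def to_bulk_ndjson(report, index="multiscanner"):
    """Convert a report dict into an Elasticsearch bulk request body."""
    lines = []
    for path, results in report.items():
        action = {"index": {"_index": index}}
        if "SHA256" in results:
            # Using the hash as the ID makes re-indexing a sample overwrite
            # its earlier document instead of creating a duplicate.
            action["index"]["_id"] = results["SHA256"]
        document = dict(results, path=path)  # keep the sample path as a field
        lines.append(json.dumps(action))
        lines.append(json.dumps(document))
    # The bulk body must end with a newline; an empty report yields no body.
    return "\n".join(lines) + "\n" if lines else ""
```

The returned string can be sent to a cluster's `_bulk` endpoint with the `application/x-ndjson` content type. Moving the path into a field keeps slashes and dots out of the field names that Elasticsearch has to map.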

We Want to Hear From You

MultiScanner is an open source project, and we welcome community contributions. If you are thinking of using it or adapting it for your organization, we'd like to hear from you. We released MultiScanner to the community to provide a common framework for automating the malware analysis work queue and to give analysts flexibility in feeding results into other frameworks and tools.