Welcome to Mark Baggett - In Depth Defense

I am the course author of SANS SEC573, Automating Information Security with Python. Check back frequently for updated tools and articles related to the course material.




New tool Freq_sort.py

I read an article on FireEye's website the other day in which they used machine learning to eliminate much of the noise that comes out of tools like strings. It's pretty interesting, and it looks like it would save me some time when looking through malware.

https://www.fireeye.com/blog/threat-research/2019/05/learning-to-rank-strings-output-for-speedier-malware-analysis.html

I wondered how effective freq.py scores would be at eliminating the noise. 45 minutes and 29 lines of Python code later, I have something that looks like it works. Check out freq_sort.py.

Before freq_sort.py, here is the output of strings on a piece of malware:

student@573:~/freq$ strings -n 6 malware.exe | head -n 20
!This program cannot be run in DOS mode.
e!Rich
`.rdata
@.data
.pdata
@.gfids
@.rsrc
@.reloc
\$0u"H
L$ SVWH
K SVWH
|$ H;_
<bt%<xt!<Zt
|$ AVH
l$ VWAV
L$ SUVWH
UVWATAUAVAWH
0A_A^A]A\_^]
UVWATAUAVAWH
@A_A^A]A\_^]


After freq_sort.py, the useful strings quickly bubble to the top. It's not perfect, but the frequency tables are not tuned for EXEs. Frequency tables built from binaries should yield better results.

student@573:~/freq$ strings -n 6 malware.exe |python3 freq_sort.py | head -n 20
Failed to convert Wflag %s using mbstowcs (invalid multibyte string)
Failed to convert pypath to ANSI (invalid multibyte string)
Failed to convert pyhome to ANSI (invalid multibyte string)
8y@:"#@<*.
WARNING: file already exists but should not: %s
opyi-windows-manifest-filename freq_server.exe.manifest
Failed to get address for PyMarshal_ReadObjectFromString
INTERNAL ERROR: cannot create temporary directory!
Failed to get address for Py_FileSystemDefaultEncoding
Failed to convert executable path to UTF-8.
Failed to get address for Py_NoUserSiteDirectory
Cannot allocate memory for ARCHIVE_STATUS
Failed to get address for PyString_FromString
Failed to get address for PyString_FromFormat
Failed to get address for Py_IgnoreEnvironmentFlag
Failed to get address for PyUnicode_FromString
Failed to get address for PyObject_SetAttrString
Failed to convert progname to wchar_t
Failed to get address for PyUnicode_FromFormat
Failed to convert %s to ShortFileName
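
For anyone curious how this kind of ranking can work, here is a minimal sketch of the general idea: count how often each character follows another in some sample text, score every string by the average probability of its character pairs, and sort by that score. This is not the code in freq_sort.py and it does not use the freq.py API. The sample text, the function names (build_table, score, sort_by_score) and the 0-100 scale are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical training text, used only for this illustration. The real tool
# loads a prebuilt freq.py frequency table instead of building one like this.
SAMPLE_TEXT = (
    "the quick brown fox jumps over the lazy dog. "
    "failed to open the file because the path does not exist. "
    "cannot allocate memory for the requested operation. "
    "warning: the program could not convert the string to the expected format. "
    "internal error while creating a temporary directory for the executable."
)


def build_table(text):
    """Count how often each character is followed by each other character."""
    table = defaultdict(Counter)
    lowered = text.lower()
    for first, second in zip(lowered, lowered[1:]):
        table[first][second] += 1
    return table


def score(line, table):
    """Average probability (scaled to 0-100) of the character pairs in line."""
    lowered = line.lower()
    pairs = list(zip(lowered, lowered[1:]))
    if not pairs:
        return 0.0
    total = 0.0
    for first, second in pairs:
        followers = table.get(first)
        if followers:
            # Share of the times `first` was followed by `second` in training.
            total += followers[second] / sum(followers.values())
    return 100.0 * total / len(pairs)


def sort_by_score(lines, table):
    """Return lines ordered from most to least text-like."""
    return sorted(lines, key=lambda line: score(line, table), reverse=True)


if __name__ == "__main__":
    table = build_table(SAMPLE_TEXT)
    candidates = [
        "UVWATAUAVAWH",
        "Failed to convert executable path to UTF-8.",
        r"@A_A^A]A\_^]",
        "Cannot allocate memory for ARCHIVE_STATUS",
    ]
    for line in sort_by_score(candidates, table):
        print(f"{score(line, table):6.2f}  {line}")
```

To use something like this as a filter behind strings, the candidates list could be replaced with the lines read from sys.stdin. A table built from a few sentences is far too small for real work. Swapping in a table trained on strings pulled from known binaries is the kind of tuning mentioned above.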


Give it a try and tell me what you think. If you find it useful or would like some features added, send me a note.

https://github.com/MarkBaggett/freq/blob/master/freq_sort.py

Mark
