I was looking up things about virus research following a discussion with John Mark Agosta about cybersecurity and uncertain reasoning, and came across this article.

Hsu, W. H. (1992). Generic Virus Detection. MacTech: The Journal of Macintosh Technology (formerly MacTutor: The Macintosh Programming Journal), 8(2).

Abstract

Recently, computer viruses have attracted a high volume of public, media, and scientific attention. This is no surprise, given the explosion in the rate at which virus code is being developed, combined with the limited success of current detection methods. A new approach to detecting such code is needed. We examine two variations on algorithms originally developed for string matching. Both modifications allow increased tolerance for variant “strains” of known viruses, especially the so-called evolutionary class, which mutate themselves at predetermined intervals.

First, a randomization scheme can be applied to an established fast substring matching procedure (such as the Boyer-Moore algorithm). This randomization allows mutation-resistant searching. Second, an approximate pattern matching algorithm for a maximum number of differences can be used. The algorithm is modified by weighting the edit distance metric to make it robust to character padding and removal. The two techniques are combined to create a generalized detector capable of finding viral clones, whether produced by human authors or by automatic variation.
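The weighted edit distance idea can be sketched as follows. This is an illustrative reconstruction in Python, not the paper's actual algorithm; the weights and function names are my own. The key point is that insertions into the scanned code (padding) are penalized less than substitutions or deletions of signature bytes, so junk inserted between signature bytes raises the match cost only slightly:

```python
def approx_search(sig, text, w_sub=2, w_del=2, w_ins=1):
    """Minimum weighted cost of aligning signature `sig` against any
    substring of `text` (Sellers-style approximate search).

    Weights are illustrative: w_ins penalizes padding inserted into the
    scanned code, w_del penalizes signature bytes that were removed,
    w_sub penalizes substituted bytes.
    """
    m, n = len(sig), len(text)
    prev = [0] * (n + 1)  # empty signature matches anywhere at cost 0
    for i in range(1, m + 1):
        cur = [prev[0] + w_del] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if sig[i - 1] == text[j - 1] else w_sub
            cur[j] = min(prev[j - 1] + cost,   # match / substitute
                         prev[j] + w_del,      # signature byte removed
                         cur[j - 1] + w_ins)   # padding byte inserted
        prev = cur
    return min(prev)  # best cost ending at any position in `text`
```

With these weights an exact occurrence costs 0, two padding bytes cost 2, and a detector would flag any region whose best cost falls below a per-signature threshold.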


Edit, 09:30 CST - That's not a refereed paper, which is why it doesn't appear on my publications list, but I could probably have gotten it into a security conference if I'd included some of the experiments with the real viruses of the day (SCORES and the nVIR strains, such as HPAT). I was 16 when I did the project, 17 when I wrote it, and 18 when it was published, so I wasn't yet clear on what the standards of academic peer review were.

Edit, 10:35 CST - The subject is still timely, and I believe the methodology may be as well, albeit in limited cases. I've learned some things about string matching, randomized algorithms, and cryptography since then. The question in this case is whether string-matching approaches to virus detection still have any relevance, compared to cryptographic checksums and security-based techniques such as managing information flow and formally verifying and certifying properties.

I'm a strong believer in defense-in-depth, so insofar as formal "integrity shells" and secure kernels are not in place on every desktop, I think string matching scans can provide a first line of defense against viruses whose signature has not yet been broadcast (via Symantec AntiVirus LiveUpdate's "Avenge Microdefs" and McAfee's Anti-Virus). There's a critical window before a CERT team dissects the virus and people "get the fix".
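A plain signature scan of the sort that provides this first line of defense can be sketched with the Boyer-Moore-Horspool simplification of Boyer-Moore. This is illustrative only (real scanners work on byte signatures, and the function name is mine):

```python
def horspool_search(sig, data):
    """Return the index of the first occurrence of `sig` in `data`,
    or -1 if absent (Boyer-Moore-Horspool).

    A precomputed shift table records, for each symbol in the
    signature, how far the window can safely jump on a mismatch.
    """
    m, n = len(sig), len(data)
    if m == 0:
        return 0
    if m > n:
        return -1
    # Last-seen position of each symbol in sig (excluding the final one),
    # expressed as a shift distance from the window's right edge.
    shift = {sig[k]: m - 1 - k for k in range(m - 1)}
    i = m - 1  # right edge of the current window
    while i < n:
        k = 0
        while k < m and data[i - k] == sig[m - 1 - k]:
            k += 1  # compare right-to-left
        if k == m:
            return i - m + 1  # full signature matched
        i += shift.get(data[i], m)  # jump by table, or whole window
    return -1
```

The shift table lets the scan skip up to len(sig) positions per mismatch, which is why Boyer-Moore-family algorithms were (and are) attractive for scanning entire disks.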

Suppose every version of (say) Windows Vista had a comprehensive checksum-based kernel component that ran as a background process or thread and could be enabled from the Control Panel the way privacy options are set in Internet Explorer's Tools -> Internet Options. In that case, string-matching would be largely unnecessary and disk scanning could be based on signatures, leaving only the evolutionary viruses to contend with.
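The checksum-based component described above amounts to keeping a baseline of per-file digests and flagging anything whose contents change. A minimal sketch in Python, assuming SHA-256 and a simple directory walk (my own illustration, not how any actual Windows component works):

```python
import hashlib
import os

def file_digest(path):
    """SHA-256 of a file's contents, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def build_baseline(root):
    """Map every file under `root` to its current digest."""
    baseline = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            baseline[path] = file_digest(path)
    return baseline

def check(baseline):
    """Return files that changed or disappeared since the baseline."""
    suspect = []
    for path, digest in baseline.items():
        if not os.path.exists(path) or file_digest(path) != digest:
            suspect.append(path)
    return suspect
```

A background process would rebuild and compare digests periodically; a file whose digest changes without an authorized install event is the integrity-shell analogue of a virus alarm.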

The problem is that, like IE's privacy options for cookies, etc., people generally either leave this at the default (medium-high?) or relax it to the most convenient level once they run into hassles. So even if Vista had a good trust model, people probably wouldn't use it much.

--
Banazir

Comments

( 5 comments — Leave a comment )
atomistictheist
Aug. 21st, 2005 01:23 pm (UTC)
w00t!
banazir
Aug. 21st, 2005 03:34 pm (UTC)
Virus detection, cybersecurity research, and "journal papers"
As I mentioned above, that's not a refereed paper, which is why it doesn't appear on my publications list, but I could probably have gotten it into a security conference if I'd included some of the experiments with the real viruses of the day (SCORES and the nVIR strains, such as HPAT). I was 16 when I did the project, 17 when I wrote it, and 18 when it was published, so I wasn't yet clear on what the standards of academic peer review were.

Sometimes I think about reviving some of these research projects - the subject, at least, is still timely, and I've learned some things about string matching, randomized algorithms, and cryptography since then. The question in this case is whether string-matching approaches to virus detection still have any relevance, compared to cryptographic checksums and security-based techniques such as managing information flow and formally verifying and certifying properties.

I'm a strong believer in defense-in-depth, so insofar as formal "integrity shells" and secure kernels are not in place on every desktop, I think string matching scans can provide a first line of defense against viruses whose signature has not yet been broadcast (via Symantec AntiVirus LiveUpdate's "Avenge Microdefs" and McAfee's Anti-Virus). There's a critical window before a CERT team dissects the virus and people "get the fix".

Suppose every version of (say) Windows Vista had a comprehensive checksum-based kernel component that ran as a background process or thread and could be enabled from the Control Panel the way privacy options are set in Internet Explorer's Tools -> Internet Options. In that case, string-matching would be largely unnecessary and disk scanning could be based on signatures, leaving only the evolutionary viruses to contend with.

The problem is that, like IE's privacy options for cookies, etc., people generally either leave this at the default (medium-high?) or relax it to the most convenient level once they run into hassles. So even if Vista had a good trust model, people probably wouldn't use it much.

--
Banazir
zaimoni
Aug. 21st, 2005 06:31 pm (UTC)
Re: Virus detection, cybersecurity research, and "journal papers"
Reminds me of one of my deep-freeze, never-started projects: a generic antiviral system for my primary machine. [This is insurance against the ultimate failure of tech support for AV systems; right now, ClamAV and F-Protect are the only ones that support Win95 at all. I'm also keeping an eye out for when open-source firewalls become viable Windows-side.]

At the moment, the "concept" is a series of write-protected bootable FreeDOS floppies, possibly with command-line Windows programs as well. [Yes, I have W2K on FAT32 for exactly this contingency. Lifesaver.] Obviously, I twang the CMOS configuration around to boot from floppy rather than HD when I want to run these.
poovanna
Aug. 21st, 2005 11:56 pm (UTC)
:-O
I was 16 when I did the project, 17 when I wrote it, and 18 when it was published, so I wasn't yet clear on what the standards of academic peer review were.

Seriously. Wow!
banazir
Aug. 23rd, 2005 07:14 pm (UTC)
Eh, tis nothin
I had a bit of a head start, plus I got lucky.

--
Banazir
