September, 2003

Simulating Network Worms

Introduction

This document describes the design and implementation of a network worm simulation system, named "NWS". It also describes using the NWS system to model aspects of real world worms, and the results of those simulations.

Definition of "network worm"

Currently, it seems unfashionable to strictly define "worm" or "virus" or any other perfectly good term. I think that commercial interests drive this - it's simpler and more profitable to threaten non-technical people with a "worm" than it is to address the real problems of a software and hardware monoculture. For purposes of defining the scope of NWS, I wanted to define "network worm".

For the purposes of this document a "worm" consists of a process or set of processes that replicates without human intervention by creating an executing copy of itself on another computer via some form of network communication.

Other similar definitions exist.

The Jargon File entry for worm says^[1]: A program that propagates itself over a network, reproducing itself as it goes.

Alexander Bartolich^[2] of the ELF virus writing HOW-TO has this definition: A worm is a program that penetrates other running programs. Penetration means to copy the executable code of the worm into the active process image of the host.

The definition seems to describe other forms of "mobile code" like Java applets. One has to realize that a worm spreads itself - a user has to ask to download a Java applet, so applets don't count as worms. Email viruses like Klez or SirCam also require human intervention. Before the virus activates, the user has to at least open the email carrying the virus, and possibly even open the attached document that contains virus code. The typical mass-mailer virus doesn't qualify as a worm by the definition above, despite the "worm" names given to mass-mailing viruses by anti-virus companies and mass media. Cliff Changchun Zou's work on email virus modeling^[3] shows the differences between mass-mailer viruses and network worms.

The clause that says "creating an executing copy of itself ... via some form of network communication" allows us to distinguish between a fork bomb^[4] and a worm. A fork bomb consists of many replicating processes on a single machine, without recourse to network communication.

Motivation for simulation

Various reasons exist for simulating the spread of network worms:

Help reverse-engineer an existing worm
Design counter-worms^[5] or beneficial worms^[6]
Design a worm that patches a security hole^[7]
Understand effects of worms on networks
Determine behavior of possible future worms

My own personal motivation consists of puzzling over the "installed base" theory. I usually encounter this theory in discussion forums like Usenet or Slashdot^[8]. The theory runs like this:

Microsoft Product X has more worms than open source Product Y because Product X runs on more machines than Product Y.

I originally wrote the NWS simulator to try to test the "installed base" theory. In the course of writing NWS, I've decided that the "installed base" theory doesn't have any meaning, for at least two reasons.

For some products (IIS and Apache), the theory is false - Apache runs on about twice^[9] the number of web servers that IIS runs on.
For other products (Outlook/Exchange), one can't falsify the theory. Unless some miracle caused reversed installed bases, no way exists to test the theory.

The installed base theory amounts to special pleading for the use of Microsoft products.

Other Simulators

Staniford, Paxson and Weaver^[10], give sparse details on simulating a "Warhol worm".

Moore, Shannon, Voelker and Savage^[11] seem to have modeled the Code Red epidemic with some effects of containment measures. They do not explain their simulator at all.

Zou, Gong and Towsley ^[12] modeled Code Red propagation, in an attempt to better match observed behavior. Apparently, they did their modeling using numerical solutions to differential equations.

Michael Liljenstam has written SSF.App.Worm^[13], a module for the Scalable Simulation Framework^[14].

Kephart and White ^[15] wrote a simulation of email virus spread. The simulator models networks with limited connectivity. They didn't fall prey to referring to such viruses as "worms".

Zou, Towsley and Gong ^[3] modeled email virus propagation using numerical simulations. They also make a distinction between "network worm" and "email virus".

NWS function

NWS provides a Perl programmer with a framework in which to write a complete network worm simulation. The programmer writes a Perl program using NWS objects and their methods. Running the program executes the simulation. NWS does require a degree of Perl knowledge, especially to write the worm code. It does not allow a user to eschew all intelligence by limiting him or her to actions or categories of actions I've already thought of.

Download

You can download the NWS simulator and try the simulations described in this document or use NWS to write your own. I developed NWS in a Linux command line environment, so the simulations print to STDOUT. Windows users may have a hard time getting these simulations to work, but they should work unchanged on all modern Unix, Linux or BSD systems that have Perl installed.

Licensing

The NWS network worm simulator is licensed under the GNU General Public License^[16] (GPL).

Framework Design

I wanted to write a system that allowed simulation of many different real-life worms. When I began, I did not understand what this entailed. At a minimum, I believed that I needed flexible configuration and setup, to allow trying different population sizes (including number and "species" of worms) and address space sizes. I also chose to have NWS actually execute "worm code" in each entity a worm infects. Actually executing code allows a worm to perform arbitrary actions just like in real life. I wanted to make any given simulation as believable as possible.

Between setup and configuration and executing code for a worm, any simulator needed to include some kind of interpreted language. I chose Perl, mainly because I had an easy time writing early versions of an SIS epidemic in Perl. I wrote the NWS framework itself in the same language that NWS users write worms.

I also chose to write NWS in a simplistic object oriented fashion. Objects in an NWS program directly correspond to real life entities that comprise a real life network: hosts, software, and messages. This should make simulations much more believable, since a user can directly count uninfected hosts or worm hosts or messages passed.

Simulated worms contain actual code

Petri nets, or finite state machines or numerical simulations could simulate any given worm, but not allow the curious investigator to make simulations of some other type of worm, or more than one strain of worm in a simulation.

Numerical simulations don't easily allow discontinuities like dormant stages in a worm's life cycle, or introductions of counter-worms after worm infections reach epidemic proportions.

One can write a simulator for any given worm without executing "worm code", but that simulator wouldn't have any kind of general purpose capability.

Real life network worms execute arbitrary code once they've replicated themselves on a new host. The only way to model this generality allows simulated worms the same ability: simulated worms must execute some form of code. The NWS framework allows and encourages this.

Simulators that operate without executing worm code will end up corroborating specific cases written for and executed in a general purpose worm simulator.

Fully-connected network

NWS simulations model fully-connected networks, using "network" in a more graph theoretic sense rather than a TCP/IP sense. Every network-connected entity can send a message to any other entity in the simulation. Each network-connected entity has an address consisting of a single number. Writing a simulation requires an NWS user to set the address space size, the largest number that gets used as an address.

In contrast, mass-mailing Outlook viruses don't operate in a fully-connected network. Many mass-mailing Outlook viruses use Outlook's address book or SMTP client, while a few look through the infected machine's file system to find email addresses. Modeling a mass-mailer virus would involve deciding how many email addresses to allot to each instance of the virus, and deciding on a probability that represents how often a virus recipient clicks on the virus attachment. Not every instance of a mass-mailer virus would send itself to all the email addresses in the simulation, so the network of email addresses would not qualify as fully-connected.

Assuming fully-connected networks seems overly optimistic relative to the real Internet, given work such as Arbor Networks paper on dark address space ^[17] and the prevalence of corporate intranets connected to the Internet via gateways and firewalls that do network address translation.

Object oriented design mimics real networks

Naively, a network links entities called hosts. A host can contain instances of software that the host executes periodically. The network allows hosts to communicate with each other via messages or streams of messages.

The NWS framework demands that its users create objects that correspond directly to entities in a real network. NWS users have to create an instance of class NWS::Network, and populate the instance of NWS::Network with instances of class NWS::Host. NWS::Host instances contain NWS::Software instances.

Due to CPU architecture or operating system executing on the host, or even due to particular versions of software executing under the operating system, real life worms can only replicate themselves on certain hosts. Users of NWS can create instances of class NWS::Host that specify CPU architecture or operating system name. Instances of NWS::Host can contain instances of class NWS::Software that specifically name the "software" that the Host executes.

When NWS::Host instances receive NWS::Message instances, the NWS::Host delivers NWS::Message instances to NWS::Software instances. NWS::Host instances have numerical addresses in a network, analogous to Internet Protocol addresses. Inside an NWS::Host, names of NWS::Software instances constitute the analog to TCP or UDP port numbers. An NWS::Message gets delivered to a host by numerical address, but the Message gets ignored if the Message instance does not contain an identifier that matches the identifier of the Host instance receiving it. After a match on Host identifier, the Host will ignore the Message instance if the Message does not name a Software instance the Host contains.

This style probably has some detrimental effect on execution speed. I hope that this style promotes more believable simulations.

Use of Perl for both Framework and Worms

Users of NWS write simulation programs in Perl, incorporating and using NWS as a set of library modules. Users also write worm code for their simulations in Perl. The same programming language gets used for setup, configuration and for subjects under study in the simulation.

One possible alternative would have the simulator framework (network, network communications, execution scheduling, etc) written in a traditional compiled language like C. The worm code would exist as fragments of a scripting language, probably an easily embeddable scripting language like Tcl^[18], Lua^[19] or S-lang^[20]. Other general simulators, like the ns^[21] network simulator, have implemented exactly the "interpreted setup and configuration, compiled back-end" design. I decided against this approach.

Experience developing various worm simulations convinces me that NWS users will modify the simulator framework to suit their purposes. I often modified the simulator framework to serve special purposes like counting message receptions, or testing different message delivery algorithms. Writing NWS with a traditional compiled language "back end", and interpreted user code for setup, configuration and worm code burden one set of users, and handicap another. Some users will want to obtain different information from a simulation than the NWS framework provides by default. These users would have to change the "back end" code to collect the information they want while they simultaneously write the simulation in question. These users would have to write simulations in two vastly different programming languages at the same time - the "back end" and the "front end" languages. This two-language design would unnecessarily limit other users who only understand the interpreted user code.

Perl's Advantages

Many people have experience with Perl: Perl implementations exist for many or all modern systems. Lots of web sites and books offer documentation on Perl and how to effectively write Perl programs. This means that the potential audience of users of NWS ends up vastly greater than an alternative simulator design that requires its users to have familiarity with at least two languages.

Perl's Disadvantages

Using Perl for both worms and the simulator library code can make debugging difficult. NWS users may find it difficult to distinguish errors in worm code from errors in library code. Perl programs can use substantial amounts of memory, and they can execute slower than one would like. Users of NWS will never have the ability to model the entire Internet, as did Staniford, Paxson and Weaver^[10].

Deviations from Reality

Any simulation discards elements of the real system it models. This causes simulations to behave differently than real life systems. People using the NWS simulator should realize that their simulations differ from reality in at least these specific ways:

Time passes in discrete steps, rather than continuously.
Communication between network-attached entities happens via message passing, and all message sent in a given time step get delivered in that same time step.
Vulnerabilities are trivial, consisting of nothing more than a flag variable set in an NWS::Software instance.
Simulated networks do not contain performance bottlenecks like routers, firewalls or bridges.
All hosts exhibit the same performance.
Network communications have infinite bandwidth.

I think that this combination of differences from reality prevents one from making any kind of correspondence between time steps in a simulation, and real-world time. It does not prevent using the NWS framework to compare qualitative differences. For example, an NWS user could check relative efficiencies of two different ways to probe the simulated network's address space.

Validation

Before trying anything difficult or interesting I attempted to validate NWS code and behavior.

Class and method level tests

I wrote simple test harnesses to execute as much NWS code as possible, and verify the code's behavior.

The NWS framework code passes all these tests. Users who modify the framework for their own purposes should consider running these test cases (and maybe others) to ensure they haven't broken the framework.

SIS model

SIS simulator code

The Susceptible-Infected-Susceptible model^[22] constitutes one of the simplest mathematical models of an epidemic. Each entity in the population of the model resides in one of two states:

Susceptible
Infected

Entities makes transitions from one state to another according to some probability. Entities participating in biological epidemics can make transitions from susceptible to infected, and possibly from infected to susceptible, depending on the illness. Suffering through the common cold only confers a temporary resistance. Realistic models have to account for this.

If no transitions from Infected to Susceptible states take place, this model approximately matches any number of real life worms: the 1988 DECnet "Father Christmas" worm^[23], 1998 ADMw0rm^[24], the first arrival of Code Red v2^[25], 2002 Slapper worm^[26] (also here^[27]).

Derivation of theoretical analytical solution

To validate the NWS framework, I need some kind of standard against which to measure it. Fortunately, SIS epidemics have an analytical solution. I'll derive most of the solution below in terms of a network worm and a population of exploitable hosts.

I use the following symbols:

`N`	Number of entities in the population
`S`	Address space size
`P`	Probability of randomly picking an address that maps to an entity
`V`	Number of infected entities in population at some time
`V₀`	Number of infected entities in population at the start of the simulation

Assume that the worms pick addresses at random. The probability of picking an address that maps to a host in the population under consideration amounts to:

P = (number of hosts)/(address space size) = N/S.

The fraction of uninfected entities amounts to:

1 - V/N

At any given instant, V number of worms try to infect the remaining uninfected entities. Since the worms have a probability of infecting the remaining uninfected proportion of entities, and only so many uninfected entities remain, the change in the number of infected hosts amounts to:

P(1-V/N)V

A real set of infected hosts operating in the real Internet would send out probes continuously, so I get to magically wave my hands and convert the change in number of infected hosts to a differential equation:

^dV/_dt = P(1-V/N)V

This differential equation, known as the Logistic Equation, turns out to have an exact solution. The solution involves some onerous bookkeeping called integration by partial fractions. ^[28] The exact, analytical solution:

V = V₀/[(1 - ^V₀/_N)e^-Pt + ^V₀/_N]

This equation has some interesting properties. It has time starting at zero and going forwards. It works in terms of the total population, the initial number of infected hosts, and the probability of getting a "hit" on a member of the population.

If the Logistic Equation's solution matches reality, an investigator could reasonably work out when a worm-writer released a worm. One would only have to make a few assumptions about population of exploitable hosts, address space size, and how many worms got released initially.

Staniford, Paxson and Weaver's^[10] "Hit-list Scanning" ends up as nothing more than a large value of V₀ in this form of the solution to the Logistic Equation.

Simulation Description

I used an address space of 65535 (16 bits of address), and I populated it with seven worm NWS::Host instance, and 9993 possible victim NWS::Host instances, for a total of 10000 entities in the population. All entities reside at randomly assigned addresses in the address space. The simulation runs for 135 time steps or until only 1% of the 9993 possible victims remain uninfected. I will clarify the choice of seven initially infected entities below.

Comparison of NWS with analytical solution

The NWS simulation doesn't match the analytical solution particularly well. I believe the difference arises from stepwise treatment of time, and a discrete population. The NWS simulation takes discrete time steps, while the analytical solution has continuous time. Infections happen in whole numbers of entities per time step. The NWS model can only show 1 or 3 or 97 entities infected at a given time step, not 1.237 or 3.532 or 97.8 entities infected at 10.346 hours. NWS has inefficiencies (it takes longer to get to 99% infection) relative to analytical solution for these reasons.

Discrete Euler's Method solution

The analytical solution to the Logistic Equation provides an upper boundary to the NWS SIS simulation. I tried a cheap-and-dirty variant of Euler's Method^[29] of solving differential equations to provide a lower boundary. I made the variant round down to the next lowest integer value of the number of entities infected over a given elapsed time.

Starting with the Logistic Equation:
^dV/_dt = P(1-V/N)V
I'll make the differential ^dV/_dt into a ratio of finite quantities,
^ΔV/_Δt = P(1-V/N)V
If I use a small enough time step (numerical value of Δt) the Euler's method approximation should still remain accurate. Multiply both sides of the equation by Δt, and use int() function to mean "next lowest integer value":
ΔV = int(P(1-V/N)VΔt)

Euler's method has you calculate a value of ΔV based on the current value of V, P and Δt. The algorithm adds ΔV to the current value of V to get the next time step's value of V, an iterative solution:

V_i+1 = V_i + int(P(1-V_i/N)V_iΔt

This necessarily undershoots the analytical solution for two reason. First, rounding down to the next lowest integer value of Δt provides the biggest loss relative to the analytical solution. Second, the iterative solution uses the slope of the function at the start of the time step to shoot the value of the function at the end of the time step. The analytical solution shows us that the curve only goes up, it never goes down, and the curve is concave for the first half of the simulation's run time. Using the slope at the start of the time step under estimates the value of the function at the end of the time step for the concave part of the curve.

The Euler's method solution graphed below started with values of V = 7, N = 65535, P = 10000/65535 = 0.1525 and Δt = 1 which match the NWS simulation. I chose the initial value of V so that the value of ΔV comes out to 1 during the first time step. The values of P and 1-V/N during the first time step mean that any smaller initial value of V will end up with ΔV = 0. This Euler's method simulation breaks down for initial values of V less than 7.

SIS Simulation Results

I ran the SIS simulation 20 times. I reduced the 20 simulation outputs to an arithmetic mean of the number of infected hosts at each time step. The graph below compares the mean of 20 SIS simulation runs with the analytical solution, and the results of running the discrete Euler's method numerical approximation derived above.

SIS Simulation Conclusions

The mean of 20 runs of the NWS SIS simulation reaches 99% infection slower than the analytical solution would have it, and falls reasonably close to the discrete Euler's Method numerical simulation. I take this to mean that the NWS SIS model correctly simulates a network worm turned loose in a population of computers without any resistance.

For small numbers of infected hosts, the real Internet shares the "discrete population" property with an NWS simulation. This seems to imply that an investigator can't use the analytical solution to the Logistic Equation to accurately find the time of an initial infection of a real Internet worm.

A worm author can take away two, opposing conclusions. Start as many worms as possible (a large hit list) to eliminate statistical effects on when a worm population gets traction, or start a single worm to randomly confuse the issue of when the epidemic started.

Cases of Greater Interest

Writing a slow, hard to use simulator that could only model worms that easier, faster methods can adequately model doesn't make sense. This section illustrates a few cases that more or less match real world worms, and don't have an analytical solution available.

CRclean

NWS simulator code
Simple numerical simulation code

Epidemiologists also consider another model of infection: Susceptible-Infected-Recovered^[30]. Observation of biological epidemics like chicken pox or small pox, diseases that confer lifetime immunity on the survivors, motivate this model. In an SIR epidemic, members of the population reside in one of three states:

Susceptible
Infected
Recovered

Members of the population make transitions from one state to another according to some probability. The SIR model limits transitions to

Susceptible to Infected
Infected to Recovered

In computer terms, one might describe the states as vulnerable, exploited and patched.

During the 2001 Code Red worm outbreak, Markus Kern posted code for a passively spreading worm^[31]. I don't have much knowledge about IIS in particular, and Microsoft products in general, but I gather that when a CRclean-infected instance of IIS receives a Code Red URI, CRclean sends a copy of itself to the Code Red-infected-host. The newly-sent-CRclean infects the Code Red-infected-IIS with itself, stops Code Red and patches the previously-infected host to prevent re-infection by Code Red. CRclean then starts watching for receipt of Code Red URIs.

The interaction between IIS, Code Red and CRclean matches an SIR epidemic pretty well. We would consider machines running IIS (but not infected with Code Red or CRclean) as "susceptible". After infection with Code Red, we would consider a machine "infected", and after infection with CRclean, we would consider the machine "recovered".

Derivation of differential equations

Despite asserting above that this is an interesting case, one that doesn't have an analytical solution, I will derive differential equations to describe the IIS/Code Red/CRclean interaction. The differential equations that I derive don't have an analytical solution, I can only solve them numerically. Since numerical solutions to systems of differential equations are notoriously poorly behaved, deriving a system of differential equations for this case that may or may not even converge on an answer doesn't render it trivial, suitable only for validating the NWS framework. Differential equations, solvable or not, also give insight into the problems they describe that simulations do not.

I'll use the following symbols to derive differential equations that describe an epidemic of Code Red among a population of IIS hosts, and automatic patching of the infected hosts by CRclean.

`N`	Number of entities in the population
`P`	Probability of picking an address that maps to an entity
`S`	Number of susceptible (uninfected) entities in population at some time
`I`	Number of infected entities in population at some time
`R`	Number of recovered entities in population at some time

First, the number of susceptible, infected and recovered entities has to sum up to the total number of entities in the population:

N = S + I + R

Next, the differential equation for the rate of change in recovered entities. Infected entities (Code Red worms in this simulation) probe the address space randomly, and they have a probability P of picking an address mapping to an entity. Since I infected entities exist in the simulation at any given time, PI represents the number of probes that hit any entities in the simulation. The number of CRclean entities equates to the number of recovered entities in this simulation. The fraction of recovered entities in the simulation amounts to R/N. so the number of probes from Code Red entities that hit a CRclean entity is PI(^R/_N). Each Code Red probe to a CRclean entity results in one less Code Red entity, PI(^R/_N) amounts to the per-time step change in recovered entities, and -PI(^R/_N) amounts to the per-time step change in the number of infected entities. Since these equations model a continuous system, we can wave a magic wand over it and convert it to a differential equation for representing the rate of change in number of recovered entities:

^dR/_dt = PI(^R/_N)

The change in number of infected entities has a similarity to the SIS model: PI constitutes the number of infectious probes that hit an entity in the address space of the simulation. Since only a fraction of the number of entities that get hit are susceptible, the change in number of infected entities due to conversion of susceptible entities amounts to PI(^S/_N)

This SIR model differs from the SIS model in that some infected entities change to recovered. That change amounts to -PI(^R/_N).

We add the two changes in number of infected entities, and once again invoke the magic of continuous time and a large population to get a differential equation:

^dI/_dt = PI(^S/_N) - PI(^R/_N)

That leaves us with three equations for the three unknowns (S, I, R):

N = S + I + R
^dR/_dt = PI(^R/_N)
^dI/_dt = PI(^S/_N) - PI(^R/_N)

Euler's Method.

I don't think that an analytical solution for the three equations above exists. One can write a numerical simulation for them, however. Assuming differentials constitute ratios of finite quantities and that using small (but not infinitesimal) values of dt approximates the analytical solution, you can write an iterative solution.

R_i+1 = R_i + (PI_iR_i/N)Δt
I_i+1 = I_i + (PI_iS_i/N - PI_iR_i/N)Δt
S_i+1 = N - I_i+1 - R_i+1

The values of S, I and R at a given time step get used to calculate the values for the next time step.

Simulation Description

To compare the NWS simulator with an Euler' method solution for an SIR epidemic, I wrote and ran an NWS simulation of CRclean versus Code Red. I used an address space of 65535 (16 bits), and I populated it with seven NWS::Host instances that run NWS worm code, modeling the start of the Code Red 1 v2 worm. I put in 9986 possible victim NWS::Host instances, modeling IIS hosts, all of them infectable by Code Red v2. All worms and victims reside at randomly assigned addresses.

To make this simulation different from the SIS model above, I put in seven NWS::Host instances that contain code for a simulated CRclean worm. This differs from reality. To my knowledge, no one deployed CRclean.

I used seven each of Code Red 1 v2 and CRclean to give the simulated Code Red worms a good probability of hitting a simulated IIS host during early time steps. This should reduce the differences between the Euler's Method numerical simulation and the NWS simulation.

The Euler's method numerical simulation starts with N = 65535, S = 9986, I = 7 and R = 7. It sets the probability of picking a host at random to P = 10000/65535. It runs for 150 time steps.

The NWS simulation of CRclean matches the Euler's Method numerical solution very closely, far closer that the NWS SIS simulation matches an Euler's method numerical simulation. I believe this results from the fact that a single differential equation can describe an SIS epidemic, while an SIR epidemic requires a system of three differential equations. An NWS simulation of an SIS epidemic "leaks" relative to the Euler's Method solution, in that NWS can't infect fractional numbers of entities. Any time the Euler's Method simulation infects something other than a whole number of entities, at the corresponding time step the NWS simulator only infects the whole number of entities. Any fractional number of infected entities leaks out of the system, since the fractional part gets added to no other quantity. The three equations used to model an SIR epidemic don't "leak" fractional parts of entities out of the system: any fraction of an entity not included in one quantity gets included in another.

CRclean conclusions

It looks like CRclean would have worked against Code Red, despite not doing any scanning on its own. More generally, it looks like a "passive" worm, a worm that merely responds to legitimate requests, instead of actively scanning for vulnerable hosts, would work, provided the population of vulnerable hosts act as clients to the passive worm's service.

Competing Worms

Two Code Red worms^[25] existed and competed for the same set of vulnerable Windows hosts running IIS. For a period in August and September of 2001, Code Red II out-infected Code Red V2. In this section, I use NWS to explore some possibilities of why one worm might out-compete another worm for the same population of infectable entities.

The case of two or more worms that use the same exploit has happened in the real Internet at least twice. Code Red I v2 and Code Red II in 2001, and the many "Slapper" variants^[32] in 2003.

I won't address the different probing strategies used by Code Red I v2 and Code Red II - the NWS framework does not currently allow simulating sub-nets, so simulating Code Red II's preference for probing IP addresses close to its own doesn't make any sense. Some other properties can be addressed.

Variation in probe rates

Simulation code

This set of simulations has an address space of size 65535, in which seven each of two, very similar worms get started. The simulation also contains 10,000 infectable hosts. For verisimilitude, I named the worms "Code Red I v2" and "Code Red II", and I named the susceptible hosts "IIS".

The simulation executes two very similar worms. The worms carry nearly identical code, differing in only two aspects. The "true name" that each worm gives to a newly-infected host differs, one set of worms uses "Code Red I v2", the second set uses "Code Red II". The two worms also differ in that "Code Red II" worms can send an extra probe to a random address some percentage of the time steps in which they execute. The "Code Red II" worm code uses a random number to decide whether to send the extra probe, so the percentage works out only on an aggregate basis - some worm instances may never send the extra probe, some instances may send it every time step. This extra probe simulates a slightly faster worm or a worm that communicates infection slightly faster.

I ran the simulation 20 times each with an extra-probe-percentage of 0, 1, 5 and 10. The following table show the mean number of infected hosts for each of the extra-probe-percentages.

	0%	1%	5%	10%
Code Red I v2	5061.9	4408.95	2726.35	1273.5
Code Red II	4951.1	5604.05	7286.65	8739.5

Even a small increase in number of probes per time step gives a population of worms a big advantage against a slightly slower population.

One worm fixes vulnerability

Simulation code

In this variant, "Code Red II" worm code marks its host as not vulnerable just after exploiting that host. This action prevents re-infection from other "Code Red II" hosts, and conversion to "Code Red 1 v2".

Worms with the true name "Code Red 1 v2" don't perform the extra work of marking the newly-infected NWS::Software instance not vulnerable. Other "Code Red 1 v2" worms can re-infect it, and "Code Red II" can infect it.

Both worms probe at the same rate, once per time step.

The simulation has an address space of 65535. I had the simulation place 10000 victim hosts randomly in the address space. I had the victim hosts contain NWS::Software instances using the identifier and true name of "IIS". At the top end of the address space (address 65525) I had the simulation place an NWS::Host instance containing worm code that upon infection, "fixes" the vulnerability it used to replicate itself. I had this worm use the true name of "Code Red II".

At the bottom end of the address space (address 10) I had the simulation place NWS::Host instance containing worm code identical to the first worm code, except that it doesn't fix the vulnerability. Instances of this worm leave the vulnerability unpatched. I had this worm use the true name of "Code Red I v2".

I ran the simulation 20 times, each time executing 100 time steps, producing the data in the table below.

Worm true name	mean hosts infected per run	median hosts infected per run
Code Red I v2	64.1	35
Code Red II	9888.8	9918

Fixing the vulnerability that initially gave access to the susceptible host confers an enormous advantage on a population of worms competing for a set of susceptible hosts that exhibit a given vulnerability.

Competing Worms Conclusions

For the case of two worms competing for the same population of susceptible hosts, worm authors should strive to write worms as "virulent" as possible.

Worms should patch the vulnerability they used to replicate themselves. It appears that all other things staying the same, having a worm fix the vulnerability it exploited to replicate gives more advantage than a 10% performance boost.

Permutation Scanning worm

Permutation scanner simulator code
Algorithmic permutation code

This simulation models the hypothetical Permutation scanning worm described in How to 0wn the Internet in your spare time^[10], in a section titled "Permutation Scanning". Staniford, Paxson and Weaver wrote about a simulation of a permutation scanning worm, but I believe they missed some key points, developed below.

Permutation scanning worms don't select addresses to probe randomly: they select addresses algorithmically. Each worm has an index, and an algorithm that maps the index value to an address. By incrementing the index, they calculate different, non-sequential addresses to probe. If the worms use a good quality algorithm to calculate address, the addresses generated from successive indexes are quite different: the series of addresses probed looks very close to random. All worms use the same algorithm to generate addresses, but individual worms pick random initial index values at infection time.

A worm that starts late will likely pick an index in a range that an earlier worm has already probed. The late-created-worm will end up probing a series of addresses that some other worm has already probed. To avoid re-probing, worms that try to re-infect previously infected hosts go inactive in the Staniford/Paxson/Weaver scenario. The worm code in the permutation scanning simulation accomplishes this by replying to any re-infection attempts. Individual re-infecting worm instances go inactive when they receive a reply.

One can spoof worms that use this type of scanning. An immune spoofer host could respond that the worm has already infected it, causing permutation scanning worms that probe the spoofer to go inactive. This would protect all hosts further in the permutation than the spoofing host. Real life worms should require two or more consecutive "already infected" replies to go inactive to prevent spoofing. The simulated worm represents the best case of a permutation scanner, with respect to optimism about reinfection.

The permutation scanning worm of this simulation uses an algorithmic permutation to probe address space. A DES-inspired algorithm combines a 32-bit key with a 16-bit index to determine the address to probe. Worms start with a random index, then generate an address to probe by using the permutation function on the index. Worms increase the index sequentially. Hopefully, the series of addresses calculated by permuting sequential indexes look random. All worms use the same permutation by using the same 32-bit key.

The simulation runs in an address space of size 65535 (16 bits). It starts with 7 permutation scanning worm and 9993 victims. The victim NWS::Host instances reside at randomly assigned addresses, as does the single initial worm instance. The simulation runs for 135 time steps or until only 1% of the 9993 possible victims remain uninfected. This allows for comparison to the SIS model random probing worm described above. I ran the permutation scanning worm simulation 20 times, and calculated mean counts of infected hosts at every fifth time step, shown in the graph below.

Permutation scanning worms and random probing worms take exactly the same number of time steps to reach 100% infection.

Permutation scanning produces fewer messages to get to 100% infection than a population of random probing worms does, and a smaller mean number of probes per host. Only active permutation scanning worms send messages, and the proportion of active permutation scanning worms rises, then drops to nearly zero by 100% infection. To contrast, every member a population of random probing worms produces messages all through the course of a simulation.

One area of difference exists: the distribution of probes over the population of entities in the simulation. When you count the probes per host, the population of permutation scanning worms produces a Zipfian distribution - almost 45% of the hosts in the simulation receive 2 probes, abut 26% receive 3 probes, 13% receive 4 probes and so on. The number of hosts receiving a certain number of probes amounts to almost exactly half the number of hosts that receive one more probe. The typical host receives far fewer scans for the optimistic permutation scanning worm, than for a random scanning worm.

The random probing worms produce a Poisson distribution of probes per host, exactly what you'd expect for a random process.

The property of not probing most hosts very often might make discerning a permutation scanning worm more difficult,

Random probes and permutation probes histogram

Permutation scanning simulation conclusions

I conclude that permutation scanning doesn't give a worm an overall advantage relative to random probing. A permutation scanning worm doesn't have any speed to 100%-infected advantage. It has a lot of disadvantages. I found it difficult to write an algorithmic permutation of a 16-bit address space that gives a random look to the permuted indexes. Depending on the vulnerability and the communication protocol involved, firewalls or NAT can prevent infected hosts from responding to repeat infection attempts, negating any advantage conferred by lowering traffic - worms that should go inactive instead keep probing. Permutation scanning has trouble with NAT-ed subnets, too. Since a given set of worms all use the same permutation, a worm instance that obtains access to a NAT-ed subnet host will almost certainly skip over most or all of the other hosts on the NAT-ed subnet, at least until an address in the subnet turns up as a permuted index. I conclude that the Internet will probably never see a permutation scanning worm that becomes epidemic.

Cheese vs 1i0n

The 1i0n^[33] Linux worm of 2001 prompted someone to write an anti-worm: Cheese^[34]. Beyond obvious differences like CRclean attacking IIS and Cheese exploiting a back door TCP server left behind by the 1i0n worm, significant differences exist between them. One significant difference between Cheese and CRclean consists of the method of finding hosts infected with the original worm. Cheese actively sought out back door TCP servers left running by 1i0n worms. CRclean waits for a Code Red URI to arrive, and attempts to infect the sender of the Code Red URI. Does active searching for hosts infected by some other worm make a difference?

The same differential equations as describe the CRclean model apply here. The derivation of the differential equations ends up with different semantics for the change in recovered entities equation. The Cheese worm actively probed for 1i0n back door servers. The probability of a Cheese worm probing an address that maps to a host remains P, but the number of Cheese worms is I, and the fraction of the population infected with 1i0n is ^I/_N. For Cheese, the differential equation for change in number of recovered entities is:

^dR/_dt = PR(^I/_N)

The PRI/N term equates algebraically to the PIR/N in the CRclean system of differential equations. The same numerical simulation can describe both CRclean/Code Red/IIS interaction and the Cheese/1i0n/BIND interaction. The differential equations give the insight that CRclean and Cheese should observe fundamentally similar limits on spreading.

Simulation Description

I used an address space of 65535 (16 bits), and I populated it with seven worm NWS::Host instances containing code representing the initial number of 1i0n worms. I also populated the address space with seven NWS::Host instances containing code representing an initial number of Cheese worms. I put in 9986 possible victim NWS::Host instances modeling hosts running BIND, the software exploited by 1i0n worms. All NWS::Host instances reside at randomly assigned addresses in the simulation's address space.

For validation purposes, the simulation runs for 150 time steps and all Cheese worms and 1i0n worms get their start at time step 1.

Cheese worm simulation compared with Euler's method

This Cheese worm simulation matches the Euler's method numerical solution about as well as the CRclean worm simulation did.

Network Traffic Comparison

The set of Cheese worms send out many more probes than the set of CRclean worms. CRclean worms only respond to Code Red probes, while Cheese worms actively probe for 1i0n back door software. The set of CRclean worms produces fewer and fewer response messages as the number of Code Red worms decreases, while the set of Cheese worms produce more and more probe messages as their population increases.

Cheese Simulation Conclusions

If you must write an anti-worm, write it in the "passive" style of CRclean, rather than the "active" style of Cheese. Passive worms produce less network traffic of their own.

msblast - effect of banded address space

msblast simulator code
Random probing worm simulator code

The msblast^[35] worm appeared in August of 2003. It took advantage of a buffer overflow in an RPC system that apparently existed in nearly every Windows NT-based operating system.

The msblast worm had an unusual method^[36] for picking addresses to probe for susceptible hosts: each worm picked a random IP address upon infecting a host, then sent probes sequentially up from the random address.

Given that network administrators have allocated addresses inside the Internet Protocol address space in bands^[37], perhaps the author of msblast thought that sequential probing would allow faster infection rates. I wrote an NWS simulation to test msblast's probing strategy against victim hosts placed in bands in the address space, and against victim hosts placed randomly throughout the address space.

Simulation Description

The real question amounts to "what difference does an address probing strategy make if the addresses of susceptible hosts are arranged in groups rather than scattered randomly?" To test this, I wrote 4 different simulations:

Random address probing worm, victims placed randomly
Random address probing worm, victims placed in bands
msblast-style probing worm, victims placed randomly
msblast-style probing worm, victims placed in bands

In each of the four simulations, I used an address space of 65535 (16 bits of address), and I populated it with seven worm NWS::Host instance, and 9993 possible victim NWS::Host instances, for a total of 10000 entities in the population. In all four simulations, the initial seven worm entities get randomly selected addresses.

In two of the simulations, the victims also get randomly selected addresses. In the other two simulations, the victims get placed sequentially in nine "bands", all bands one-ninth of the address space apart. Each band contains one-ninth of the victim hosts.

For one simulation each of random and banded placement of victim hosts, I modeled a random address probing worm, like Code Red v2, or Slapper. Each time it gets executed, a random address probing worm picks an address in the simulation's address space at random. It tries to infect that address.

In the second set of two simulations (banded placement of victims and random placement of victims), I modeled the msblast worm. The msblast worm code picks an address at random when it infects a previously susceptible host. Each time the worm code executes after that, it probes the address, then increments the address. Written an alternate way, it probes sequentially up the address space. I had the code wrap around to address 1 if and when it gets to the maximum address in the simulation.

The four simulations run for 135 time steps or until only 1% of the 9993 possible victims remain uninfected.

I ran the simulations 20 times each. I reduced the 20 simulation outputs to an arithmetic mean of the number of infected hosts at each time step. The graph below compares the mean of 20 simulation runs for each of the four simulations

msblast probing conclusions

A worm that probes an address space randomly reaches 100% infection just as fast for both victim host layouts, random and banded. The msblast-style probing worm performed differently for the two victim host layouts.

For a set of victims with randomly-allocated addresses, the random probing worm dramatically outperforms msblast-style sequential probing.

For a set of victims in nine bands, the random probing worm performs identically to the random probing worm with a set of randomly positioned victims. The msblast-style sequential probing performed almost as well.

I hypothesize that the author of the real msblast had access to a network with sequentially-allocated addresses for testing. The sequential probing msblast worm worked fairly well on the network used for testing, maybe even better than random probing, so the msblast author chose an address selection algorithm that didn't work very well in the real Internet.

I made the choice of nine bands fairly arbitrarily. Some possibility exists that more or fewer bands, or even a more densely-populated address space make the msblast address selection algorithm more efficient than random probing.

Programming guide

Installation

I assume that users of NWS will make modifications to the library modules that implement the NWS framework. It seems useless to install NWS as an official Perl module when the users will almost certainly have customized copies of the NWS modules in the directories of specific simulations.

To install NWS, you create a directory into which you unpack the distribution with a command similar to gunzip -c nws.tar.gz | tar xf -. All the example simulations from this document unpack as well. Every example simulation gains the use of the NWS framework by including four Perl "use" directives:


use NWS::Network;
use NWS::Message;
use NWS::Host;
use NWS::Software;

If you create your own simulation, include the same four Perl "use" statements.

This boils down to having a directory named NWS located in the directory in which simulation code resides. The NWS subdirectory must contain the four Perl modules (Network.pm, Message.pm, Host.pm, Software.pm) that implement the NWS framework.

Simple Worm Example

Simple worm code

An example of a very simple worm will help the reader understand the NWS simulator. The following 28 lines of Perl illustrate how to write a simulator for a simple Susceptible-Infected-Susceptible model.


     1	#!/usr/bin/perl
     2	use NWS::Network;
     3	use NWS::Message;
     4	use NWS::Host;
     5	use NWS::Software;
     6	
     7	my $worm_code = q{
     8		my ($host, $software, $phase) = @_;
     9		$host->SendMsg(
    10			Message->new(
    11				$host->RandAddress,
    12				'OS-CPU',
    13				'Victim',
    14				$software->{code})
    15		) if $phase == $Host::Run;
    16		$software->{true_name} = 'Worm' if $phase == $Host::Init2;
    17	};
    18	
    19	my $network = Network->new(65535, 0);
    20	$network->AddHost(Host->new('OS-CPU', 0, Software->new('Worm', $worm_code, 0)));
    21	$network->AddHost(Host->new('OS-CPU', 0, Software->new('Victim', '', 1)))
    22		for (1..10000);
    23	
    24	for (my $i = 1; $i < 135; $i += 5) {
    25		$network->Run($i, 5)->PrintCounts;
    26	}
    27	
    28	exit 0;

lines 7-17 show that a working worm doesn't need a lot of code. Note that the worm code does two different things, attempting to replicate itself at a random network address if called with the $Host::Run code, and setting the NWS::Software's "true name" to "Worm" if called with the $Host::Init2 code.
line 8 shows what arguments get passed to worm code when it executes: the instances of NWS::Host and NWS::Software that provide a context for the code to execute in, and a numeric indication of what "phase" the worm got called during.
line 11 sets the destination address of the NWS::Message object created on lines 10-14. The Host::RandAddress method returns a random address in the address space of the NWS::Network containing the Host instance.
line 12 sets the Message object's identifier instance variable to the value "OS-CPU". Host instances will ignore Message objects sent to them that do not possess an identifier that matches their own identifier.
line 13 sets the Message object's software instance variable to "Victim". When a Host instance accepts a Message based on matching identifier instance variable values, the Host tries to find an instance of Software based on the value of the Message's software instance variable.
line 16 sets the "true name" of the newly infected NWS::Software instance to "Aworm". Every instance of class NWS::Software has an identifier and a true name. This allows the worm to infect Software instances based on identifier, and it allows a simulation to report on infections by true name. It also illustrates a difficult real world problem: how does one name what Code Red does to an IIS process? A Code-Red-infected-IIS still contains (if not executes) IIS code.
line 19 creates and initializes the NWS::Network object. The NWS::Network object connects all the instances of class NWS::Host that take part in the simulation, and does the work of passing messages between NWS::Host instances. This simulation gives this Network object a 16-bit address space, and lets the NWS::Network code choose a seed value for the random number generator.
lines 20-22 illustrate how to stock an NWS::Network object with NWS::Host objects. The 9999 "victim" Host instances each contain an instance of class Software with a name that the worm code knows. The "victim" software gets marked vulnerable.
lines 24-26 run the simulator for 135 timestamps, printing counts of the NWS::Software instances contained in the simulation's collection of NWS::Host instances every 5 time steps. The simulation calls NWS::Network::PrintCounts method to do the output.

Simple worm output

When invoked from a Linux command line prompt, the simple worm simulator produces output that looks like this:


 4:16PM 642 % ./simple.pl
#step, msg count,       Victim, Worm
5       9       9999    2
#step, msg count,       Victim, Worm
10      10      9999    2
#step, msg count,       Victim, Worm
15      15      9998    3
#step, msg count,       Victim, Worm
20      23      9994    7
#step, msg count,       Victim, Worm
25      52      9989    12

The NWS::Network::PrintCounts method produces two lines of output each time that a program calls it. The first line begins with the '#' (pound sign) character, and lists the last time step number, the number of messages sent since the last call to NWS::Network::PrintCounts, followed by true names of the NWS::Software instances in the simulation.

The second line contains a tab-separated list of numbers, counts of the same true names of NWS::Software instances that appear on the first line.

This output format served me well, allowing interactive and batch graphing using Gnuplot. It may not serve others as well. NWS::Network::PrintCounts does not use any information other than that contained in NWS::Network instance variables, so any NWS user can re-write PrintCounts to suit him or her self.

Classes

NWS::Network

Instances of NWS::Network model all the wiring and routers and gateways that compose a single real network.

A program using the NWS simulation framework should only bother to create a single instance of class NWS::Network. The instance of Network contains and connects instances of class Host. The Network instance calls NWS::Host::RunProcess method on each contained instance of NWS::Host, which can cause a NWS::Host to send a NWS::Message instance to some other NWS::Host contained by the NWS::Network instance. NWS::Network instances deliver NWS::Message instances around the simulated network.

NWS::Network Methods

Constructor: new NWS::Network $address_space $seed
You must pass in the size of the address space you want to use. The value of the $address_space argument should amount to the largest address in the simulation. If you don't pass in a seed for the random number generator, The NWS::Network constructor will choose one for you. Passing in a particular seed allows you to re-run a simulation to see if you get identical results.
Example use:
```
	my $network = new NWS::Network 10000;
```
The above Perl fragment creates an NWS::Network instance that allows a maximum address of 10000, and seeds the simulation's random number generator with a value of its own making.
GetSeed: returns the number used as the seed of the random number generator. Example use:
```
	print "Simulation's PRNG seed: ", $network->GetSeed, "\n";
```
AddHost: add an NWS::Host entity to the network. Gives the Host an address if it doesn't already have one (NWS::Host::address evaluates to 0 or undefined). Will not put a Host at address zero. Will not give two Host instances the same address if it picks the addresses. By way of an example:
```
	my $host = new Host 'Solaris-SPARC', 0;
	$host->AddSoftware(new Software, 'snmpXdmid', '', 1);
	my $address = $network->AddHost($host);
	print "Host added at address ", $address, "\n";
```
The example creates an NWS::Host instance with vulnerable software, and an address of zero. Inserting $host into the NWS::Network will cause $network to select a random address in its address space for $host. AddHost returns the numerical address at which it placed the Host. This allows the NWS user to place victims randomly in a simulation's address space, and still build up a "hit list" for the initial worm instance.
Run: execute the simulation. Call this on an NWS::Network instance after you've added hosts and performed any other setup. Run takes two scalar arguments: the time step it should start counting from, and a number of time steps to execute before returning. The internal state of any give Network instance stays constant between calls to Run unless a user's simulation code changes it. Numbering time steps is just a convenience to the user. Returns a reference to the NWS::Network instance on which you call it.
```
	for (my $i = 1; $i < 100; $i += 5 ) {
		$network->Run($i, 5)->PrintCounts;
	}
```
The above code fragment executes 100 time steps on the contents of $network, 5 steps at a time. After each 5 time steps, the example executes PrintCounts on $network. If you only care about end results of a simulation, you can have all time steps executed without interruption:
```
	$network->Run(1, 100);
```
PrintCounts: Each call to this method prints two newline-terminated lines of text to STDOUT. The first line begins with an octothorpe ('#') character, and contains tab-separated column names. The second line contains the numeric values of the columns named in the first line. Each second line contains time step number and a count of all Message objects sent since the last call to PrintCounts. Each second line also contains a count of the number of Software instances present in the Network when PrintCounts executes. The Software instance counts go by the value of the true_name instance variable.

Letting the AddHost method prefer to place entities at random addresses in a simulation's address space may constitute a flaw. This preference may lead NWS users to assume that the real Internet, or even a corporate intranet, has a randomly-addressed population. Simulations that don't match reality closely may result.

NWS::Network Instance variables

hosts: a hash of NWS::Host instances, keyed by numeric address.


	foreach my $h (values %{$network->{hosts}}) {
		print "Host at", $h->{address}, " received ",
			$h->{msg_cnt}->{rcvd}, " total probes\n";
	}

The above example enumerates all NWS::Host instances contained in $network.

total_msg_count: a scalar count of how many NWS::Message objects have been sent so far.
total_hit_count: a scalar count of how many NWS::Message objects have been delivered to NWS::Host instances. Use Host::msg_cnt instance variable to obtain a per-host count of messages that got delivered and learn how the host accepted them.

The last two instance variables mainly have use in determining if a given simulation correctly sends and receives enough messages. For example, over a large enough number of time steps, randomly addressed messages should hit enough hosts to make the ratio of total_hit_count to total_msg_count equal the ratio of number of hosts to address space size.

NWS::Host

Instances of class NWS::Host represent computers attached to some form of a network. Each NWS::Host object has a numerical address associated with it by the NWS::Network object in a simulation, and can contain zero or more NWS::Software instances.

NWS::Host Methods

Constructor: creating a new Host requires two arguments: a string identifying the "type" of the host, like "Linux-x86" or "Solaris-SPARC", and a number, the desired address for the newly constructed Host when you place it in a Network via NWS::Network::AddHost.
```
	my $susceptible = new Host "OS-CPU", 0;
	$network->AddHost($susceptible);
```
Since the above code fragment constructs Host instances with an address of zero, the Network object places the Host instances randomly in its address space.
AddSoftware: put NWS::Software into a Host instance. Used in setup of simulations, but also useful in worm code, to install "back door" software as so many real life worms do.
```
	my $wormhost = new Host "Linux-x86", 0;  # network assigns address
	my $sw = new Software "ADMworm", $worm_code, 0;
	$wormhost->AddSoftware($sw);  # host infected at simulation start
```
NWS::Host only allows one Software instance of a particular identifier. Since the identifier value gets used like TCP or UDP port numbers, this limitation shouldn't cause too much difficulty.
RemoveSoftware: remove a specific instance of NWS::Software from a Host instance.
```
	# Remove the offending exploited software
	$host->RemoveSoftware('bind');
```
Note that the string passed to RemoveSoftware removes based on value of identifier instance variable of the Software, not on the true_name.
RandAddress: returns a scalar, a number somewhere in the simulation's address space. Useful to random probing worms:
```
	$host->SendMsg(
		new Message
			$host->RandAddress,
			"Windows",
			"IIS",
			$software->{function}
	);
```
The NWS::Host::RandAddress method actually calls a method NWS::Network::RandAddress, because Network objects actually know the size of the address space. You may wish to override NWS::Host::RandAddress with a different programmable random number generator (PRNG) if you disbelieve in Perl's built in PRNG, or you want to simulate a worm that uses a buggy or poorly-designed PRNG, like Code Red I v1.
TimeStep: returns a scalar, the numerical value of the current time step. Might have use to when simulating something like Code Red going inactive the 20th of every month.
SendMsg: initiate delivery of an NWS::Message object to an address in the simulation. Message instances sent to an empty address get silently discarded. No notion of successful or unsuccessful delivery gets returned to the sender.
```
	$host->SendMsg(
		Message->new(
			67, 'OS-CPU', 'Exploited',
			$software->{function}
		)
	);
```
The above code fragment sends a Message to address 67. Should NWS find a Host at address 67, NWS follows the process described below in RecvMsg to either ignore the Message, pass the Message to a Perl function (worm code), or exploit a Software instance in the host at address 67.
RecvMsg: method called by Network instance that contains the Host. Causes the Host to decide whether to accept the message (message identifier matches the host identifier), deliver the message to some specific Software instance (value of message's software instance variable matches the name of some Software instance in the host), exploit the Software (matching Software instance has a true value in its exploitable instance variable) or merely deliver the message to the matching Software.
RunProcesses: called by the NWS::Network instance that contains the Host. Causing all executable functions contained by Software instances in the Host to get run once. Called from NWS::Network::Run once each time step.

NWS::Host Instance variables

identifier: a string that allows simulations to distinguish between varieties of Host instances. The value of identifier determines which messages (NWS::Message instances) the Host will accept. NWS::Message instances contain an identifier that must match the value of Host instance variable identifier before any further processing gets done. The simulations in this document used a single identifier, but simulations of heterogeneous networks would use several.
address: a scalar, the numerical value used to locate Host instances in a network. Set by the NWS::Host constructor, but a zero value from construction lets the NWS::Network instance that a Host gets assigned to pick the address at random.
internet: a reference to the NWS::Network instance containing the Host. Allows worm code to get access to any member functions or instance variables of NWS::Network.
msg_cnt hash containing four keys:
- rcvd: the number of NWS::Message instances delivered to this host.
- matched: the number of NWS::Message instances that matched this host's identifier value. Less than or equal to the value of rcvd
- accepted: the number of NWS::Message instances delivered to this host that got delivered to an instance of NWS::Software contained in the host. Less than or equal to the value of matched.
- exploited: the number of NWS::Message instances delivered to this host that exploited a vulnerability in the NWS::Software that accepted the messages. Less than or equal to the value of accepted.
msg_seq_no: A count of how many Message objects this Host has sent out. msg_seq_no gets incremented and used as the value of a Message object's sequence instance variable.

Using msg_cnt correctly can allow you to keep track of the efficiency of a given address probing scheme.

NWS::Software

Instances of class NWS::Host can contain instances of class NWS::Software. NWS::Software instances have a name, so that an NWS::Host can contain more than one instance of NWS::Software. This situation models real computers, which often run hundreds of different processes at any given instant.

Instances of NWS::Software can contain actual code - fragments of Perl code that get actually compiled and executed when the simulation's NWS::Network tells the NWS::Host objects to run their processes.

NWS::Software Methods

NWS::Software only has a constructor method. Creating a new instance of Software requires three arguments: a string naming the Software, a string containing a fragment of Perl code, and a number 0 or 1.

Example NWS::Software constructor use:


	my $annunciator_code = qw{
		my ($host, $software, $phase, $msg) = @_;
		print "Host at ", $host->{address}, " called during phase ",
			$phase, "\n";
	};
	my $sw = Software->new(
		"Call phase Annunciator",
		$annunciator_code,
		0
	);

The string naming the Software gets used to decide whether or not to accept a given NWS::Message object sent to a Host. The fragment of Perl code actually gets compiled (via an eval BLOCK; construct) and runs during simulation execution. The number indicates whether the Software is vulnerable (1) or not (0).

The constructor method compiles any Perl fragment passed as an argument, and it sets identifier and true_name instance variables to the value of the first argument of the constructor. After that, the constructor calls the compiled Perl fragment with the $Host::Init phase argument. Worm code should change the true_name value at this point.

NWS::Software Instance variables

identifier: a scalar, a string used by Host instances to keep track of the Software instances they contain. Also used to discriminate between Message instances received by a Host: a Message has an instance variable that has to lexically equate to a Software instance's identifier. In this respect, the Software instance's identifier acts something like a TCP or UDP port number. It serves to direct message traffic to the intended process on the addressed host. All the examples in this document use a phrase like "Victim" or "IIS", but simulations could use digit-strings for identifiers, too, increasing the verisimilitude of the simulation.
function: a reference to a closure created at construction from the Perl fragment passed as a string argument to the constructor. This is the "worm code" that this document frequently mentions. Called as a function at specific times during a simulation's execution.
exploitable: a scalar, used as a boolean to decide whether or not a particular Software instance should be exploited by a just-received Message, or whether the Message should be passed to the Software instance's function. That is, Software instances have to be marked not exploitable before they receive messages with a phase of $Host::Recv.
true_name: a string. This string doesn't get used any where except in the NWS::Network::PrintCounts method, and it has no real life analog. The lack of real life analogy points out a real life difficulty: exactly how does one tell what a running process is? Infection of SQLServer by the "slammer" worm didn't change the process ID of the infected process. true_name exists so that an NWS user can write worms that by convention set true_name to indicate infection. Without a distinguishing characteristic like true_name, NWS users would have to look at counts of messages flowing through the system to decide what a worm population was doing, just as real life network administrators do.

NWS::Message

Instances of class NWS::Message model frames or packets, and constitute the hunks of information that NWS::Host objects can pass between each other. The NWS simulations act as if the Internet mainly communicated using UDP/IP instead of TCP/IP.

NWS::Message Methods

NWS::Message only has a constructor. Arguments to the constructor set contents of all the instance variables that a user is responsible for:


	my $probe = new Message
		$host->RandAddress,    # Destination address
		"Windows-x86",         # Host identifier
		"IIS",                 # Software identifier
		$software->{function}; # Worm code

NWS::Message Instance variables

destination: a number, the address to which some entity sent the Message. Equates numerically to the recipient Host's address. Set by the user via the constructor method.
source: a number, the address of the entity that sent the Message. Set in NWS::Host::SendMsg Useful to send Message's back to an attacking worm.
identifier: a string. Has to lexically equate to a Host's identifier before the Host will accept the message. Allows simulating heterogenous networks. Set by the user via the constructor method.
software: a string. The name of the Software for which the Message is intended. Analogous to TCP or UDP port number. All examples in this document use names like "IIS" or "BIND", but "80" or "53" or "135" would work just as well. Set by the user via the constructor method.
code: function pointer. Used to transport "worm code" from Host to Host. Set by the user via the constructor method.
sequence: a number, increasing from 1 for each Message a given Host sends. Set in NWS::Host::SendMsg

An entity sending an NWS::Message only has to fill in destination, identifier and software for the Message to pass through the delivery algorithm.

The code instance variable is the means by which a worm passes executable content to another entity in the simulation. Usually, a programmer uses the NWS::Software::function contents to fill in code, but this isn't enforced.

Worm Code

Users of the NWS simulator see worm code as text, more specifically as fragments of the Perl programming language. At runtime, an NWS simulation creates instances of class NWS::Software. If a programmer passes a Perl code fragment to the constructor of NWS::Software, the code fragment actually gets compiled via an eval statement. The instance of NWS::Software so constructed carries around a reference to the compiled code fragment, analogous to a function pointer in other programming languages.

Example Worm Code

The following 26-line fragment of Perl constitutes a fairly sophisticated worm. When executed in an NWS simulation, it sends infectious messages to random addresses. If it receives an infectious message from another worm instance, it replies back. If it receives a reply to an infectious message, it marks itself inactive, and sends no further messages.


     1	my $worm_code = q{
     2		my ($h, $sw, $c, $m) = @_;
     3		if ($c == $Host::Run and $sw->{active}) {
     4			$h->SendMsg(
     5				new Message
     6					$h->RandAddress,
     7					"OS-CPU",
     8					"Victim",
     9					$sw->{function}
    10			);
    11		} elsif ($c == $Host::Recv) {
    12			if (!$m->{already_infected}) {
    13				$m->{destination} = $m->{source};
    14				$m->{already_infected} = $h->{address};
    15				$m->{code} = 0;
    16				$h->SendMsg($m);
    17			} else {
    18				$sw->{true_name} = 'Worm - inactive';
    19				$sw->{active} = 0;
    20			}
    21		} elsif ($c == $Host::Init2 or $c == $Host::Init) {
    22			$sw->{true_name} = 'Worm - active';
    23			$sw->{exploitable} = 0;
    24			$sw->{active} = 1;
    25		}
    26	};

line 2 illustrates the worm code calling convention
lines 3-10 perform the work of infecting susceptible hosts. The worm probes randomly selected addresses, and sends its own code out, via the value of $sw->{function}.
lines 11-20 perform one of two different actions upon receipt of a Message:
1. lines 13-16 reply to an infectious message from some other instance of the worm to let that other instance know it probed a worm, not a susceptible host or an empty address. This code adds an instance variable named already_infected to the NWS::Message instance before sending it back.
2. lines 18, 19 react to the receipt of a message that some other worm sent using lines 13-16. Lines 18 and 19 get executed if the Message referenced by $m has an instance variable named already_infected that has a value. Line 19 sets the active instance variable to 0, which means that the test on line 3 fails any subsequent times the worm code gets called with a $c value of $Host::Run. After executing line 19, the worm no longer probes for susceptible hosts.
lines 21-24 act during the two initialization phases: upon compilation during Software instance construction, and upon exploiting a vulnerable host. This piece of code sets the true_name instance variable, marks the Software as not vulnerable, and takes advantage of the Perl feature of dynamically adding instance variables to create and set the active instance variable.

If there is a trick to writing worm code, it's to realize that you have to write code that's not only thread-safe, but also re-entrant.

When worm code runs

The code contained by instances of class NWS::Software gets executed in four different situations.

Just after an NWS::Host adds the software, in method NWS::Host::AddSoftware
Once each time step during normal network operation. An NWS::Network instance calls NWS::Host::RunProcess on each NWS::Host instance it contains at every time step.
When a NWS::Message object makes its way from one NWS::Host to another. If the receiving NWS::Host finds the message appropriate to its "operating system" and "CPU" it passes the incoming message to the appropriately named NWS::Software instance. The NWS::Software's code gets executed, and the NWS::Message instance gets passed to the code.
Just after exploiting an NWS::Software instance.

NWS::Software instances execute the code once on construction to allow the code to initialize run-time constants rather than recalculating the constants at every execution. Similarly, an NWS::Host causes the execution of any code an exploiting NWS::Message might contain just after the exploitation. This allows worm code to set up and initialize. Running code in NWS::Software instances that receive a NWS::Message allows NWS::Hosts to cooperate, or it allows writing worms like CRclean, that style "strike backs" when they receive a particular NWS::Message.

Code for a worm doesn't have to execute anything in any of the four situations. In the simple worm example section, line 15 constitutes the test for execution of the worm code during normal network operation. Line 16 tests for execution just after exploitation. The simple worm code in the example gets executed in the two other situations (on message reception and at installation), but does nothing.

Worm code calling convention

When an NWS simulation executes the NWS::Network::Run method, any compiled code fragments get called. The arguments to the compiled code fragment look like this:

NWS::Host instance currently causing the code fragment to execute
NWS::Software instance that contains the code fragment
A numerical code indicating the "phase" of execution. The four phases symbolically:
- $Host::Init: indicates that the code runs just after NWS::Software compiled it with an eval.
- $Host::Init2: indicates that the code runs just after it has infected a new host.
- $Host::Run: indicates that the code runs as its normal, once-per-time-step turn.
- $Host::Recv: indicates that the code has received a Message object from some other entity in the simulation.
For phases Host::Recv and Host::Init2, the relevant NWS::Message object

A Message object doesn't always appear in the worm code's argument list. When a Message appears, the phase of execution determines what the Message contains. If the phase argument equates to $Host::Init2, the Message is the infectious communication that caused a new replicated worm instance to start. If the phase argument equates to $Host::Recv, the Message argument came from a worm instance executing in some other Host in the network.

Worm code history

I believe that recounting what I did to arrive at the current arrangement of worm code and its calling conventions would help the reader to understand how to write better worm code. Initially, I had every NWS::Software instance carrying around worm code as a string. Every time a NWS::Software instance got called upon to execute its code, an eval of the string took place. I did this because I didn't have a good idea of what information (including what objects) worm code would need access to in order to act realistically. This technique worked, but it did cost a lot of execution time to eval a string in every worm during every time step.

My second effort involved having NWS::Software instances carry around worm code as a string, and additionally carry the worm code around as a pointer to a closure (a dynamically compiled anonymous function). Well-known names for information important to a worm's execution ($host, $software) existed when the code-as-string got compiled to a closure. Worms would send the code-as-string to potential targets. Upon exploitation, targets compiled the code-as-string to their own private closures. This provided a fair performance increase, since eval didn't happen for every worm for every time step, and it also narrowed the information available to any given worm to well-known variable names present when strings got evaled to closures.

To allow for simulation of different worm behavior, I had added arguments to calls to the worm code: a number indicating the "phase" of execution, and the message just received. I noticed that only a few well-know variables got used in even the most elaborate simulations: references to the NWS::Host and NWS::Software that contained the worm code.

I also realized that the simulations spent large amounts of time compiling the worm code to a closure ever time a worm exploited a vulnerability. This caused me to change from compiling a closure for every infected NWS::Software instance to compiling a single closure. Instead of passing worm-code-as-text, the NWS::Message instances now contain a pointer to the appropriate closure. Instead of having well-known variables available to the closure at compile time, I changed the worm code calling convention to include exactly those well-known variables as arguments. This brought down execution time and overall memory usage of any given simulation.

Further Work

Gatewayed network simulation

Simulate Network Address Translation gateways and firewalls by including code in class NWS::Network that allows one instance of NWS::Network to contain a second instance of NWS::Network. The second instance should pass messages appropriately by translating addresses, and contain a second set of NWS::Host instances.

One could use such an enhanced NWS::Network class to simulate Code Red II's spread behind firewalls and gateways.

Rate limited devices

The current NWS simulation framework includes nothing with a performance problem. Real life networks have devices like firewalls or routers or specific software that suffers during a network worm epidemic. Rate limited devices or software end up getting bogged down by worm-related traffic, and either take a long time to deal with legitimate traffic, or drop legitimate traffic entirely.

Adding rate-limiting bottlenecks to a network simulation would allow the use of NWS as more than a worm design tool. Users could develop a worm and a network simulation without bottlenecks, than add or turn on bottleneck objects to see where the worm causes network failures.

Rate limiting could take the form of a limit on how many NWS::Message instances can pass in a time step, whether the remaining messages get queued or discard, or it could take the form of shutting down if the number of Messages to pass gets too large.

A combination of rate limited devices and gatewayed networks could help to confirm Cliff C Zou's two-parameter Code Red model ^[12].

Bibliography

[1] The Jargon File
Eric S. Raymond
http://www.catb.org/~esr/jargon/html/W/worm.html
[2] The ELF Virus Writing HOWTO, section 1.1
Alexander Bartolich, 2003-02-15
http://www.linuxsecurity.com/resource_files/documentation/virus-writing-HOWTO/_html/intro.html#what.exactly.is.a.virus
[3] Email Virus Propagation Modeling and Analysis
Cliff C. Zou, Down Towsley, Weibo Gong
University of Massachusetts, Amherst, Technical Report TR-CSE-03-04
http://tennis.ecs.umass.edu/~czou/research/emailvirus-techreport.pdf
[4] The Jargon File
Eric S. Raymond
http://www.jargon.net/jargonfile/f/forkbomb.html
[5] Cheese beats crackers
Mark Ward, Tuesday, 22 May, 2001, 12:04 GMT 13:04 UK
http://news.bbc.co.uk/1/hi/sci/tech/1344344.stm
[6] The Case for Beneficial Computer Viruses and Worms
Greg Moorer
23rd National Information Systems Security Conference Proceeding
October 16-19, 2000
http://csrc.ncsl.nist.gov/nissc/2000/proceedings/papers/601.pdf
[7] Whitehat 'Max Vision' gets 18 months
Kevin Poulsen
The Register, http://www.theregister.co.uk/
22/05/2001 at 19:01 GMT
http://www.theregister.co.uk/content/archive/19132.html
[8] Slashdot, News for Nerds. Stuff that matters.
http://slashdot.org/
[9] Netcraft: Web Server Survey Archives
As of August 2003, the Apache web server ran on 63.98% of sites surveyed. Microsoft's IIS ran on 23.75%
http://news.netcraft.com/archives/web_server_survey.html
[10] How to 0wn the Internet in Your Spare Time
Stuart Staniford, Vern Paxson, Nicholas Weaver
Proceedings of the 11th USENIX Security Symposium (Security '02)
http://www.usenix.org/publications/library/proceedings/sec02/
August 5-9, 2002
http://www.icir.org/vern/papers/cdc-usenix-sec02/
[11] Internet Quarantine: Requirements for Containing Self-Propagating Code
David Moore, Colleen Shannon, Geoffrey M. Voelker, Sefan Savage
Proceedings of the 2003 IEEE Infocom Conference
April, 2003
http://www.cs.ucsd.edu/users/savage/papers/Infocom03.pdf
[12] Code Red Worm Propagation Modeling and Analysis
Cliff Changchun Zou, Weibo Gong, Don Towsley
9th ACM Conference on Computer and Communication Security (CCS'02)
November 18-22, 2002
http://tennis.ecs.umass.edu/~czou/research/codered.pdf
[13] SSF.App.Worm A Network Worm Modeling Package for SSFNet
Michael Liljenstam
http://www.cs.dartmouth.edu/~mili/research/ssf/worm/
[14] Scalable Simulation Framework
http://www.ssfnet.org/homePage.html
[15] Directed-Graph Epidemiological Models of Computer Viruses
Jeffrey O. Kephart and Steve R. White
Proceedings of the 1991 IEEE Computer Society Symposium on Research in Security and Privacy
May 20-22, 1991
http://www.research.ibm.com/antivirus/SciPapers/Kephart/VIRIEEE/virieee.gopher.html
[16] GNU GENERAL PUBLIC LICENSE, version 2
June 1991
http://www.gnu.org/licenses/gpl.txt
[17] Shining Light on Dark Address Space
Craig Labovitz, Abha Ahuja, Michael Bailey
November 13, 2001
http://arbornetworks.com/downloads/dark_address_space.pdf
[18] Tcl Developer Site
http://www.tcl.tk/
[19] Lua: an Extensible Embedded Language
Luiz Henrique de Figueiredo, Roberto Ierusalimschy, and Waldemar Celes
Dr. Dobb's Journal 21 #12 (Dec 1996) 26-33.
http://www.lua.org/ddj.html
[20] S-Lang Library Information Page
http://www.s-lang.org/
[21] The Network Simulator - ns-2
http://www.isi.edu/nsnam/ns/
[22] Models: Why?
Ben Bolker
14 Nov 2000
http://www.zoo.ufl.edu/bolker/eep-2000/si1.html
[23] Christmas 1988 Decnet Worm -- Counteracted
Cliff Stoll
RISKS-FORUM Digest Wednesday 4 January 1989 Volume 8 : Issue 2
27 Dec 88
ftp://ftp.sri.com/risks/8/risks-8.02
[24] A Brief Analysis of the ADM Internet Worm
Max Vision
circa May 1998
http://www.whitehats.com/library/worms/adm/index.html
[25] CAIDA Analysis of Code-Red
http://www.caida.org/analysis/security/code-red/#crv2
[26] OpenSSL Vulnerabilities
Johannes Ullrich
2002-09-16
http://isc.incidents.org/analysis.html?id=167
[27] Global Slapper Worm Information Center
F-Secure Corporation
http://www.europe.f-secure.com/slapper/
[28] The Logistic Equation and Integration by Partial Fractions
University of British Columbia Department of Mathematics
http://www.ugrad.math.ubc.ca/coursedoc/math101/notes/moreApps/logistic.html
[29] Euler's Method
Joel Feldman
http://www.math.ubc.ca/~feldman/math/odesolvers.pdf
[30] The SIR Model for Spread of Disease
David Smith and Lang Moore, Duke University
http://www.math.duke.edu/education/ccp/materials/postcalc/sir/contents.html
[31] Re: CodeGreen beta release (idq-patcher/antiCodeRed/etc.)
Markus Kern
Sept 1 2001
VULN-DEV email list
http://www.securityfocus.com/archive/82/211462
[32] Scalper and Slapper Worms Genealogy
David Goldsmith
2002-10-02
http://isc.incidents.org/analysis.html?id=177
[33] Lion Internet Worm Analysis
Max Vision
circa April, 2001
http://www.whitehats.com/library/worms/lion/index.html
[34] Incident Note IN-2001-05 - The "cheese" worm
Kevin Houle
Thursday, May 17, 2001
http://www.cert.org/incident_notes/IN-2001-05.html
[35] Advisory CA-2003-20 W32/Blaster worm
Chad Dougherty, Jeffrey Havrilla, Shawn Hernan, and Marty Lindner
August 14, 2003
http://www.cert.org/advisories/CA-2003-20.html
[36] Recode from disassembly of the Win32 DCOM worm
Rolf Rolles
August, 2003
VULN-DEV email list
http://lists.insecure.org/lists/vuln-dev/2003/Aug/att-0029/RPC_DCOM_recode_and_analysis.TXT
[37] IP v4 Address Space Utilization
Sean McCreary and kc claffy
8/27/98
http://www.caida.org/outreach/resources/learn/ipv4space/

Development Environment

I did not use large numbers of tools, or an expensive hardware setup to develop NWS.

SuSE 7.3 linux - http://www.suse.com
Perl v5.6.1 built for i586-linux - http://www.perl.org
The vim text editor - http://www.vim.org
The Gnuplot interactive plotting program - http://www.gnuplot.info/

The computer running SuSE 7.3 linux had a 1 GHz AMD Duron processor on a Micro ATX MS-6340M motherboard, with 256 Mb of memory. This computer ran 20, 135-time step NWS simulations in about 15 minutes wall clock time.