Analyzing Microbial Growth with R
Brian Connelly | Wed, 09 Apr 2014
http://bconnelly.net/2014/04/analyzing-microbial-growth-with-r/

In experimental evolution research, few things are more important than growth. Both the rate of growth and the resulting yield can provide direct insights into a strain or species’ fitness. Whether one strain with a trait of interest can outgrow (and outcompete) another that possesses a variation of that trait often depends primarily on the fitnesses of the two strains.

Zach Blount and his mountain of Petri dishes (Photo: Brian Baer)

Because of its importance both for painting the big picture and for properly designing experiments, I spend a very large portion of my time studying the growth of different candidate bacterial strains in different environments. This usually means counting colonies that grow on mountains of Petri dishes or using a spectrophotometer to measure the absorbance of light as it passes through populations growing in clear microtiter plates. In science, replication is key, so once my eyes have glazed over from counting colonies, or once the plate reader dings from across the lab to tell me that it’s done, my job becomes assembling all the data from the replicate populations of each strain in each environment to try to figure out what’s going on. Do the growth rates show the earth-shattering result that I’m hoping for, or will I have to tweak my experimental design and go through it all again? The latter is almost always the case.

Because analyzing growth is so fundamental to what I do, and because I repeat it ad nauseam, having a solid and easy-to-tweak pipeline for analyzing growth data is a must. Repeating the same analyses over and over again is not only unpleasant, but it also eats up a lot of time.

Here, I’m going to describe my workflow for analyzing growth data. It has evolved quite a bit over the past few years, and I’m sure it will continue to do so. As a concrete example, I’ll use some data from a kinetic read in a spectrophotometer, which means I have information about growth for each well in a 96-well microtiter plate, measured periodically over time. I’ve chosen a more data-dense form of analysis to highlight how easy it can be to analyze these data. However, I use the same pipeline for colony counts, single reads in the spec, and even for data from simulations. The following is an overview of the process:

  • Import the raw data, aggregating from multiple sources if necessary
  • Reformat the data to make it “tidy”: each record corresponds to one observation
  • Annotate the data, adding information about experimental variables
  • Group the replicate data
  • Calculate statistics (e.g., mean and variation) for each group
  • Plot the data and statistics

This workflow uses R, which is entirely a matter of preference. Each of these steps can be done in other environments using similar tools. I’ve previously written about grouping data and calculating group summaries in Summarizing Data in Python with Pandas, which covers the most important stages of the pipeline.

If you’d like to work along, I’ve included links for sample data in the section where they’re used. The entire workflow is also listed at the end, so you can easily copy and paste it into your own scripts.

For all of this, you’ll need an up-to-date version of R or RStudio. You’ll also need the dplyr, reshape2, and ggplot2 packages, which can be installed by running the following:

install.packages(c('reshape2', 'dplyr', 'ggplot2'))

The Initial State of the Data

To get started, I’ll put on my TV chef apron and pull some pre-cooked data out of the oven. This unfortunate shortcut is necessary because every piece of software that I’ve used for reading absorbance measurements exports data in a slightly different format. Even different versions of the same software can export differently.

So we’ll start with this CSV file from one of my own experiments. Following along with my data will hopefully be informative, but it is no substitute for using your own, so if you’ve got data, I encourage you to use it instead. Working this way lets you see how each step transforms your data and how the whole process comes together. For importing, try playing with the formatting options to read.table. There’s also a rich set of command line tools that make creating and manipulating tabular data quick and easy, if that’s more your style. No matter how you get there, I’d recommend saving the data as a CSV (see write.csv) as soon as you’ve dealt with the import so that you never again have to face that unpleasant step.
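For example, if your reader exports tab-delimited text with a few lines of header junk above the readings, the import and re-save might look something like this (the file name and the number of skipped lines are hypothetical; adjust them to match your own export):

# Hypothetical import: a tab-delimited export with 3 lines of header junk.
# Adjust sep, skip, and header to match your instrument's output.
imported <- read.table("data/plate-reader-export.txt", sep="\t", skip=3,
                       header=TRUE, stringsAsFactors=FALSE)

# Re-save as CSV so you only fight the import once. row.names=FALSE keeps
# write.csv from adding an extra index column.
write.csv(imported, "data/growth-raw.csv", row.names=FALSE)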

rawdata <- read.csv("data/growth-raw.csv")

Each row of the example file contains the Time (in seconds), Temperature (in degrees Celsius), and absorbance readings at 600 nm for the 96 wells in a microtiter plate over a 24-hour period. These 96 values each have their own column in the row. To see the layout of this data frame, run summary(rawdata). Because of its large size, I’m not including the output of that command here.
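If you’d like a quick look without printing all 98 columns, you can peek at just a corner of the data frame:

# Show the first few rows and columns (Time, Temperature, wells A1-A4)
rawdata[1:5, 1:6]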

Even microtiter plates can be mountainous, as Peter Conlin's bench shows.

Whether your software exports text, XML, or something else, this basic layout is very common. Unfortunately, it’s also very difficult to work with, because there’s no easy way to add more information about what each well represents. To know which well corresponds to which treatment, you’ll most likely have to keep referring to your memory, your lab notes, or something else that reminds you how the plate was laid out. Not only is this very inconvenient for analysis—your scripts will consist of statements like treatment_1_avg <- (B4 + E6 + H1) / 3, which are incomprehensible in just about all contexts besides perhaps Battleship—but it also almost guarantees a miserable experience when looking back on your data after even just a few days. In the next step, we’ll rearrange the data and add more information about the experiment itself. Not only will this make the analysis much easier, but it’ll also make the data easier to share with others or with your future self.

Tidying the Data

As we saw before, our original data set contains one row for each point in time, where each row has the absorbance value for each of our 96 wells. We’re now going to follow the principles of Tidy Data and rearrange the data so that each row contains the value for one read of one well. As you will soon see, this means that each point in time will correspond to 96 rows of data.

To do this rearranging, we’re going to be using the melt function from reshape2. With melt, you specify which columns are identity variables, and which columns are measured variables. Identity variables contain information about the measurement, such as what was measured (e.g., which strain or environment), how (e.g., light absorbance at 600 nm), when, and so on. These are kind of like the 5 Ws of Journalism for your experiment. Measured variables contain the actual values that were observed.

library(reshape2)

reshaped <- melt(rawdata, id=c("Time", "Temperature"), variable.name="Well",
                 value.name="OD600")

In the example data, our identity variables are Time and Temperature, while our measured variable is absorbance at 600 nm, which we’ll call OD600. Each of these will be represented as a column in the output. The output, which we’re storing in a data frame named reshaped, will also contain a Well column that contains the well from which data were collected. The Well value for each record will correspond to the name of the column that the data came from in the original data set.

Now that our data are less “wide”, we can take a peek at their structure and first few records:

summary(reshaped)
 
       Time        Temperature        Well            OD600       
  Min.   :    0   Min.   :28.2   A1     :  4421   Min.   :0.0722  
  1st Qu.:20080   1st Qu.:37.0   A2     :  4421   1st Qu.:0.0810  
  Median :42180   Median :37.0   A3     :  4421   Median :0.0970  
  Mean   :42226   Mean   :37.0   A4     :  4421   Mean   :0.3970  
  3rd Qu.:64280   3rd Qu.:37.0   A5     :  4421   3rd Qu.:0.6343  
  Max.   :86380   Max.   :37.1   A6     :  4421   Max.   :1.6013  
                                 (Other):397890
head(reshaped)
 
   Time Temperature Well  OD600
 1    0        28.2   A1 0.0777
 2   20        28.9   A1 0.0778
 3   40        29.3   A1 0.0779
 4   60        29.8   A1 0.0780
 5   80        30.2   A1 0.0779
 6  100        30.6   A1 0.0780

There’s a good chance that this format will make you a little bit uncomfortable. How are you supposed to do things like see what the average readings across wells B4, E6, and H1 are? Remember, we decided that doing it that way—although perhaps seemingly logical at the time—was not the best way to go because of the pain and suffering that it will cause your future self and anyone else who has to look at your data. What’s so special about B4, E6, and H1 anyway? You may know the answer to this now, but will you in 6 months? 6 days?

Annotating the Data

Based solely on the example data set, you would have no way of knowing that it includes information about three bacterial strains (A, B, and C) grown in three different environments (1, 2, and 3). Now we’re going to take advantage of our newly-rearranged data by annotating it with this information about the experiment.

One of the most important pieces of this pipeline is a plate map, which I create when designing any experiment that uses microtiter plates (see my templates here and here). These plate maps describe the experimental variables tested (e.g., strain and environment) and their values in each of the wells. I keep the plate map at my bench and use it to make sure I don’t completely forget what I’m doing while inoculating the wells.

A plate map for our example data

For the analysis, we’ll be using a CSV version of the plate map pictured. This file specifies where the different values of the experimental variables occur on the plate. Its columns describe the wells and each of the experimental variables, and each row contains a well and the values of the experimental variables for that well.

In this sample plate map file, each row contains a well along with the letter of the strain that was in that well and the number of the environment in which it grew. If you look closely at this plate map, you’ll notice that I had four replicate populations for each treatment. In some of the wells, the strain is NA. These are control wells that just contained growth medium. Don’t worry about these; we’ll filter them out later on.

platemap <- read.csv("data/platemap.csv")
head(platemap, n=10)
 
    Well Strain Environment
 1    B2      A           1
 2    B3      B           1
 3    B4      C           1
 4    B5   <NA>           1
 5    B6      A           2
 6    B7      B           2
 7    B8      C           2
 8    B9   <NA>           2
 9   B10      A           3
 10  B11      B           3

We can combine the information in this plate map with the reshaped data by pairing the data by their Well value. In other words, for each row of the reshaped data, we’ll find the row in the plate map that has the same Well. The result will be a data frame in which each row contains the absorbance of a well at a given point in time as well as information about what was actually in that well.

To combine the data with the plate map, we’ll use the inner_join function from dplyr, indicating that Well is the common column. Inner join is a term from databases that means to find the intersection of two data sets.

library(dplyr)

# Combine the reshaped data with the plate map, pairing them by Well value
annotated <- inner_join(reshaped, platemap, by="Well")

# Take a peek at the first few records in annotated
head(annotated)
 
   Time Temperature Well  OD600 Strain Environment
 1    0        28.2   B2 0.6100      A           1
 2   20        28.9   B2 0.5603      A           1
 3   40        29.3   B2 0.1858      A           1
 4   60        29.8   B2 0.1733      A           1
 5   80        30.2   B2 0.1713      A           1
 6  100        30.6   B2 0.1714      A           1

This produces a new table named annotated that contains the combination of our absorbance data with the information from the plate map. The inner join will also drop data for all of the wells in our data set that do not have a corresponding entry in the plate map. So if you don’t use a row or two in the microtiter plate, just don’t include those rows in the plate map (there’s nothing to describe anyway). Since the inner join takes care of matching the well data with its information, an added benefit of the plate map approach is that it makes data from experiments with randomized well locations much easier to analyze (unfortunately, it doesn’t help with the pipetting portion of those experiments).
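If you want to double-check what the join dropped, dplyr’s anti_join returns the records from its first argument that have no match in its second, which makes a quick, optional sanity check:

# Optional sanity check: wells in the data that have no plate map entry
# (these are the wells the inner join dropped)
dropped <- anti_join(reshaped, platemap, by="Well")
unique(dropped$Well)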

Let’s pause right here and save this new annotated data set. Because it contains all information related to the experiment—both the measured variables and the complete set of identity variables—it’s now in an ideal format for analyzing and for sharing.

# Write the annotated data set to a CSV file
write.csv(annotated, "data-annotated.csv")

Grouping the Data

Now that the data set is annotated, we can arrange it into groups based on the different experimental variables. With the example data set, it makes sense to collect the four replicate populations of each treatment at each time point. Using this grouping, we can begin to compare the growth of the different strains in the different environments over time and make observations such as “Strain A grows faster than Strain B in Environment 1, and slower than Strain B in Environment 2.” In other words, we’re ready to start learning what the data have to tell us about our experiment.

For this and the following step, we’re once again going to be using the dplyr package, which contains some really powerful (and fast) functions that allow you to easily filter, group, rearrange, and summarize your data. We’ll group the data by Strain, then by Environment, and then by Time, and store the grouping in grouped. As shown in the Venn diagram, this means that we’ll first separate the data based on the strain. Then we’ll separate the data within each of those piles by the environment. Finally, within these smaller collections, we’ll group the data by time.

grouped <- group_by(annotated, Strain, Environment, Time)
Grouping the data by Strain, Environment, and Time

What this means is that grouped contains all of the growth measurements for Strain A in Environment 1 at each point in Time, then all of the measurements for Strain A in Environment 2 at each point in Time, and so on. We’ll use this grouping in the next step to calculate some statistics about the measurements. For example, we’ll be able to calculate the average absorbance among the four replicates of Strain A in Environment 1 over time, and likewise for each of the other treatments.

Calculating Statistics for Each Group

Now that we have our data partitioned into logical groups based on the different experimental variables, we can calculate summary statistics about each of those groups. For this, we’ll use dplyr’s summarise function, which allows you to execute one or more functions on any of the columns in the grouped data set. For example, to calculate the number of measurements, the average absorbance (from the OD600 column), and the standard deviation of the absorbance values:

stats <- summarise(grouped, N=length(OD600), Average=mean(OD600), StDev=sd(OD600))

The resulting stats data set contains a row for each of the different groups. Each row contains the Strain, Environment, and Time that define that group as well as our sample size, average, and standard deviation, which are named N, Average, and StDev, respectively. With summarise, you can apply any function to the group’s data that returns a single value, so we could easily replace the standard deviation with 95% confidence intervals:

# Create a function that calculates 95% confidence intervals for the given
# data vector using a t-distribution
conf_int95 <- function(data) {
    n <- length(data)
    error <- qt(0.975, df=n-1) * sd(data)/sqrt(n)
    return(error)
}

# Create summary for each group containing sample size, average OD600, and
# 95% confidence limits
stats <- summarise(grouped, N=length(OD600), Average=mean(OD600),
                   CI95=conf_int95(OD600))

Combining Grouping, Summarizing, and More

One of the neat things that dplyr provides is the ability to chain multiple operations together using magrittr’s %>% operator. This allows us to combine the grouping and summarizing from the last two steps (and filtering, sorting, etc.) into one line:

stats <- annotated %>%
          group_by(Environment, Strain, Time) %>%
          summarise(N=length(OD600), 
                    Average=mean(OD600),
                    CI95=conf_int95(OD600)) %>%
          filter(!is.na(Strain))

Note that I’ve put the input data set, annotated, at the beginning of the chain of commands and that group_by and summarise no longer receive an input data source. Instead, the data flow from annotated, through group_by and summarise, and finally through filter, just like water through a pipe. The added filter removes data from the control wells, which had no strain.
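If it helps to see what %>% is doing, the same chain can be written as ordinary nested function calls. The piped version simply reads in the order the operations happen:

# The same computation written as nested calls, for comparison
stats <- filter(summarise(group_by(annotated, Environment, Strain, Time),
                          N=length(OD600),
                          Average=mean(OD600),
                          CI95=conf_int95(OD600)),
                !is.na(Strain))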

Plotting the Results

Now that we have all of our data nicely annotated and summarized, a great way to start exploring it is through plots. For the sample data, we’d like to know how each strain grows in each of the environments tested. Using the ggplot2 package, we can quickly plot the average absorbance over time:

ggplot(data=stats, aes(x=Time/3600, y=Average, color=Strain)) + 
       geom_line() + 
       labs(x="Time (Hours)", y="Absorbance at 600 nm")

Plot of the average absorbance over time, colored by strain

The obvious problem with this plot is that although we can differentiate among the three strains, we can’t see the effect that environment has. This can be fixed easily, but before we do that, let’s quickly dissect what we did.

We’re using the ggplot function to create a plot. As arguments, we say that the data to plot will come from the stats data frame. aes defines the plot’s aesthetics, the mappings that ggplot uses to connect columns of data to visual properties of the plot. In this case, the x values will come from our Time column, which we divide by 3600 to convert seconds into hours. The corresponding y values will come from the Average column. Finally, things such as lines and points will be colored based on the Strain column.

The ggplot function sets up a plot, but doesn’t actually draw anything until we tell it what to draw. This is part of the philosophy behind ggplot: graphics are built by adding layers of different graphic elements. These elements (and other options) are added using the + operator. In our example, we add a line plot using geom_line. We could instead make a scatter plot with geom_point, but because our data are so dense, the result isn’t quite as nice. We also label the axes using labs.
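For example, swapping geom_line for geom_point gives the scatter plot version; shrinking and fading the points (the values here are just a starting guess) tames the density a little:

# Scatter plot variant of the same figure: points instead of lines.
# size and alpha are arbitrary choices that help with overplotting.
ggplot(data=stats, aes(x=Time/3600, y=Average, color=Strain)) +
       geom_point(size=0.5, alpha=0.5) +
       labs(x="Time (Hours)", y="Absorbance at 600 nm")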

Back to the problem of not being able to differentiate among the environments. While we could use a different line type for each environment (using the linetype aesthetic), a more elegant solution would be to create a trellis chart. In a trellis chart (also called small multiples by Edward Tufte), the data are split up and displayed as individual subplots. Because these subplots use the same scales, it is easy to make comparisons. We can use ggplot’s facet_grid to create subplots based on the environments:

ggplot(data=stats, aes(x=Time/3600, y=Average, color=Strain)) + 
       geom_line() + 
       facet_grid(Environment ~ .) +
       labs(x="Time (Hours)", y="Absorbance at 600 nm")
Trellis plot showing the growth of the strains over time for each environment

Let’s take it one step further and add shaded regions corresponding to the confidence intervals that we calculated. Since ggplot builds plots layer-by-layer, we’ll place the shaded regions below the lines by adding geom_ribbon before geom_line. The ribbons take their fill color from Strain, while color=NA keeps their edges from being drawn. Since growth is exponential, we’ll also plot our data on a log scale with scale_y_log10:

ggplot(data=stats, aes(x=Time/3600, y=Average, color=Strain)) +
       geom_ribbon(aes(ymin=Average-CI95, ymax=Average+CI95, fill=Strain),
                   color=NA, alpha=0.3) + 
       geom_line() +
       scale_y_log10() +
       facet_grid(Environment ~ .) +
       labs(x="Time (Hours)", y="Absorbance at 600 nm")
Our final plot showing the growth of each strain as mean plus 95% confidence intervals for each environment

In Conclusion

And that’s it! We can now clearly see the differences between strains as well as how the environment affects growth, which was the overall goal of the experiment. Whether or not these results match my hypothesis will be left as a mystery. Thanks to a few really powerful packages, all it took was a few lines of code to analyze and plot over 200,000 data points.

I’m planning to post a follow-up in the near future that builds upon what we’ve done here by using grofit to fit growth curves.

Complete Script

library(reshape2)
library(dplyr)
library(ggplot2)

# Read in the raw data and the platemap. You may need to first change your
# working directory with the setwd command.
rawdata <- read.csv("data/growth-raw.csv")
platemap <- read.csv("data/platemap.csv")

# Reshape the data. Instead of rows containing the Time, Temperature,
# and readings for each Well, rows will contain the Time, Temperature, a
# Well ID, and the reading at that Well.
reshaped <- melt(rawdata, id=c("Time", "Temperature"), variable.name="Well", 
                 value.name="OD600")

# Add information about the experiment from the plate map. For each Well
# defined in both the reshaped data and the platemap, each resulting row
# will contain the absorbance measurement as well as the additional columns
# and values from the platemap.
annotated <- inner_join(reshaped, platemap, by="Well")

# Save the annotated data as a CSV for storing, sharing, etc.
write.csv(annotated, "data-annotated.csv")

conf_int95 <- function(data) {
    n <- length(data)
    error <- qt(0.975, df=n-1) * sd(data)/sqrt(n)
    return(error)
}

# Group the data by the different experimental variables and calculate the
# sample size, average OD600, and 95% confidence limits around the mean
# among the replicates. Also remove all records where the Strain is NA.
stats <- annotated %>%
              group_by(Environment, Strain, Time) %>%
              summarise(N=length(OD600),
                        Average=mean(OD600),
                        CI95=conf_int95(OD600)) %>%
              filter(!is.na(Strain))

# Plot the average OD600 over time for each strain in each environment
ggplot(data=stats, aes(x=Time/3600, y=Average, color=Strain)) +
       geom_ribbon(aes(ymin=Average-CI95, ymax=Average+CI95, fill=Strain),
                   color=NA, alpha=0.3) + 
       geom_line() +
       scale_y_log10() +
       facet_grid(Environment ~ .) +
       labs(x="Time (Hours)", y="Absorbance at 600 nm")

Extending to Other Types of Data

I hope it’s also easy to see how this pipeline could be used in other situations. For example, to analyze colony counts or a single read from a plate reader, you could repeat the steps exactly as shown, but without Time as a variable. Otherwise, if there are more experimental variables, the only change needed would be to add a column to the plate map for each of them.
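As a rough sketch of the single-read case (the input file name is hypothetical, and I’m assuming one row whose columns are the wells), the same pipeline simply loses its Time column:

# Hypothetical endpoint experiment: one absorbance reading per well.
# With no id columns, melt treats every column as a measured variable.
endpoint <- read.csv("data/endpoint-raw.csv")
reshaped <- melt(endpoint, variable.name="Well", value.name="OD600")
annotated <- inner_join(reshaped, platemap, by="Well")

# Group only by the experimental variables; there is no Time to track
stats <- annotated %>%
         group_by(Environment, Strain) %>%
         summarise(N=length(OD600),
                   Average=mean(OD600),
                   CI95=conf_int95(OD600)) %>%
         filter(!is.na(Strain))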

Acknowledgments

I’d like to thank Carrie Glenney and Jared Moore for their comments on this post and for test driving the code. Many thanks are also in order for Hadley Wickham, who developed each of the outstanding packages used here (and many others).

When Cooperating Means Just Saying No
Brian Connelly | Thu, 20 Jun 2013
http://bconnelly.net/2013/06/when-cooperating-means-just-saying-no/

This post originally appeared on the BEACON Blog on May 13, 2013.

Evolutionary biologists often talk like economists, particularly when the topic is cooperation. Instead of dollars, euros, or pounds, the universal currency in evolution is fitness. A species that cooperates cannot survive when competing against a non-cooperative opponent unless the fitness benefits provided by cooperation, such as those resulting from greater access to resources, outweigh the costs. To make matters more complicated, cooperative benefits often take the form of “public goods,” which benefit all nearby individuals, cooperator or not. This sets the stage for the emergence of “cheaters”, which exploit the cooperation of others without contributing themselves. Despite cooperation seeming at odds with the notion of “survival of the fittest”, we now have a good understanding of how cooperation can persist in the face of cheaters, thanks to the tremendous work of Fisher, Haldane, Hamilton, Price, and those who have since followed. When the costs and benefits are favorable, and when close relatives are more likely to receive those benefits, cooperation can survive and even thrive.

Just another day in the lab. Making plates with Belen Mesele (L) and Helen Abera (R), two of the people working on the project with me. Our wild-type cooperator strains produce beautiful blue-green colonies due to the production of pyocyanin, another behavior regulated by quorum sensing.

Environments are always changing, and since the environment plays a dominant role in determining the fitness costs and benefits associated with all traits, natural selection may quickly change between favoring cooperation and not. When the balance shifts so that cooperation becomes more costly than beneficial, cooperators risk being driven to extinction by cheaters or other non-cooperators that do not pay those costs. So how can cooperators survive these tough times? The answer is frustratingly simple—by not cooperating. The challenge, though, is in determining when to cooperate and when to be more self-centered. We humans and other primates are—perhaps very arguably—good at estimating whether or not cooperation will benefit ourselves and those with whom we are similar, either genetically or in our beliefs. We are able to do this by integrating a great deal of information about our world and the people in it. But we are not alone in this.

Surprisingly, it turns out that even relatively “simple” bacteria are extremely effective at determining whether or not to cooperate based on the state of their environment and the composition of their population. One of the ways that these bacteria accomplish this is through quorum sensing. With quorum sensing, individuals communicate with each other by releasing and detecting small molecules, which are used as signals. When an individual detects low levels of the signal, it can use this information to assume either that there are too few other cooperators nearby to produce sufficient benefits by cooperating, or that the public good will be flushed out of the environment before it can be used. However, when that individual detects high levels of the signal, it is likely that there are many relatives nearby that would benefit from cooperation. By communicating this way using signals specific to their own species, bacteria use quorum sensing to rapidly adjust their behaviors to maximize their fitness as the environment changes.

Josie Chandler recently wrote about her fascinating work that addressed how bacteria use quorum sensing to control the production of antibiotics. While she investigated this process as a means of competing with other species, it can also be viewed as a form of cooperation among members of the same species. By using antibiotics to kill off competitors, sometimes self-sacrificially, more resources become available to those that remain. And because species often have resistance to the antibiotics that they produce, those that remain after an antibiotic attack are likely to be close relatives.

The production of antibiotics is just one example of a behavior controlled by quorum sensing. Since its discovery in the early 1970s, quorum sensing has been observed across a wide variety of species. Among the behaviors regulated by quorum sensing, those related to cooperation and other social interactions are perhaps the most prevalent. Because of this, quorum sensing is believed to play a key role in allowing cooperation to persist in ever-changing environments.

Plated differences in colony morphologies
Colonies formed by two of our strains. Through the production of elastase, our cooperators are able to break down the proteins present in this milk agar plate, forming large, clear halos. Our non-cooperator strain does not produce elastase, so it is unable to break down the milk proteins, and a much smaller halo is produced.

Although the connection between quorum sensing and cooperation is now well known, little is understood about how these behaviors became interlinked. To begin addressing this, I am currently working in Ben Kerr’s Lab on a number of projects that investigate the co-evolution of cooperation and quorum sensing. To gain a broader picture of this process, we’re pairing microbial experiments with computational and mathematical models.  The cooperative behavior we’re focusing on in our study system, Pseudomonas aeruginosa, is the production of the digestive enzyme elastase. When secreted into the environment as a public good, elastase breaks down large proteins into smaller, usable sources of nutrients available to all cells in the surrounding area. In environments where these large and otherwise inaccessible proteins are the main nutrient source, this behavior is extremely beneficial. (Here is a short video demonstrating growth of our bacteria.)

Pseudomonas faces
In these environments, where the proteins are a limited resource, cooperators do better thanks to the benefits provided by elastase. We can measure the amount of cooperation occurring within populations by extracting the elastase that is produced, plating it, and examining the size of the clearing that forms. Note: faces add no scientific value.

By exposing our populations to different environments over many generations, we are directly observing how communication and cooperation co-evolve. Through these experiments, we are investigating how quorum sensing enables cooperation to be maintained, the types of environments in which this occurs, and the different ways in which this regulation can occur. We hope that through this work, we can gain a greater understanding of the complex social processes that occur in natural ecosystems and in some of the infections that create tremendous health challenges.

SEEDS Paper Published at EvoSoft Workshop
Brian Connelly | Tue, 31 Jul 2012
http://bconnelly.net/2012/07/seeds-paper-published-at-evosoft-workshop/

I recently went to Philadelphia for GECCO 2012, where I presented my paper The SEEDS Platform for Evolutionary and Ecological Simulations at the EvoSoft workshop. It was great to talk with people about modern evolutionary computation software systems, especially the DEAP team, who share a lot of my ideas.

The Role of Environment in the Evolution of Cooperation
Brian Connelly | Mon, 25 Apr 2011
http://bconnelly.net/2011/04/the-role-of-environment-in-the-evolution-of-cooperation/

This post originally appeared on the BEACON Blog on April 25, 2011.

Cooperation is something that most people take for granted. It’s woven into just about every part of our lives. Our societies have even developed a wide variety of measures to make sure we’re cooperating, such as punishing those who don’t. This level of cooperation isn’t unique to humans. Cooperation plays a vital role in nearly all forms of life, from our primate cousins to ants and termites, and all the way down to simple microorganisms such as bacteria. There’s even an astounding amount of cooperation going on within our bodies. Amazingly, of the ten trillion or so cells in the human body, over 90% are bacterial cells, representing thousands of different species.

Brian at the Trier Amphitheater

While it’s easy to find examples of cooperation in nature, understanding how cooperation got its roots, how it evolved, and how it is maintained raises very tricky questions, especially when viewing evolution as “survival of the fittest”. If the goal is to outcompete everyone, why would one want to pay costs to help others? This is a question evolutionary biologists have been asking since Darwin, who wrote, “If it could be proved that any part of the structure of any one species had been formed for the exclusive good of another species, it would annihilate my theory, for such could not have been produced through natural selection.”

Over the years, a lot has been learned about cooperation. Most of this knowledge has come from studying cooperation using mathematical and computational models or by studying organisms in lab environments. The problem with these methods, though, is that they only examine cooperation in contexts that don’t necessarily match real-world situations.

My research focuses on understanding the different ways in which the environment can affect the evolution of cooperation.  Peter and Rosemary Grant summed this up nicely when they wrote, mimicking a famous quote by Theodosius Dobzhansky, “Nothing in evolutionary biology makes sense except in the light of ecology.”

The benefits of understanding how cooperation is maintained are huge. For billions of years, life existed only as single-celled organisms. At some point, cells began cooperating with each other, and our first multicellular ancestors emerged. Cooperation among bacteria also plays a large role in diseases like cholera, which killed over 100,000 people in 2010. A substantial factor in the spread of cholera is quorum sensing, a cooperative process that bacteria use to coordinate behaviors. By understanding how cooperation works in infections like cholera, treatments can potentially be designed to disrupt cooperation, and perhaps lessen the strength of the infection or limit its spread. Further, by understanding how the environment affects this behavior, researchers will have a better idea of how their results in laboratory environments will translate to natural environments like the body.

In simulations of cooperative behaviors, cooperators exist in patches which are constantly invaded by cheaters, or those that take advantage of the cooperation without themselves contributing.

My background is in computer science, so to start understanding how the environment can affect cooperation, I’ve used computational models of cooperation in Avida and in SEEDS, an open source package I’ve co-developed. My initial models looked at the role that environmental disturbance plays in cooperation and demonstrated that cooperation increases as environmental conditions worsen.

Some of my other work examined the effect that the amount of resource present in the environment has on cooperation. We found that the more resource an individual had, the more likely it was to cooperate, since the costs relative to its wealth decreased. This only occurred above a certain point, though. Below this point, the benefits provided by cooperation just didn’t outweigh the costs, so no cooperation occurred.

Another study looked at how the number of social interactions an individual has affects a population’s ability to maintain cooperation and diversity. Here we found that as the number of interactions goes up, there comes a point at which populations quickly lose the ability to maintain diversity. Although these results came from a small system, I still wonder whether they could tell us anything about the direction our increasingly-connected society is heading.

View this video at https://www.youtube.com/embed/r80RMW4F4FM

One of the really outstanding aspects of both BEACON and MSU is the opportunity for collaboration. I’m extremely fortunate to have an advisor, Dr. Philip McKinley, who personifies this spirit of collaboration. One such collaboration that he initiated was a meeting with Dr. Chris Waters, a fairly new faculty member in the Department of Microbiology and Molecular Genetics. This came at a point where I’d finished some of my initial computational work on cooperation and had become familiar with how cooperative behaviors were being studied using microorganisms. Meeting with Chris was really exciting for me, since I’d known about some of his earlier work with quorum sensing in bacteria.

Plates of Vibrio cholerae used to measure cooperation in different resource environments

What I didn’t expect was that Chris offered me the opportunity to start asking the same kinds of questions about how environment affects cooperation in his lab, using real bacteria! Now, I’ve always been the kind of person who gets excited about learning and trying new things, so I was thrilled. Still, my microbiology background was nonexistent, and pretty much the only thing I remembered about biology (which I hadn’t taken since my freshman year of high school) was how to draw the stages of mitosis. Fortunately, Chris was really helpful at getting me started, and with the help of other people in the lab, I was able to perform some initial experiments. I’m now at a point where I’m performing some pretty complex (although maybe just to me) experiments that I designed based on what I’d learned. I’ve seen firsthand that what I do in the wet lab improves and inspires my computational work, and that the computational work improves and inspires the wet lab work in turn. I’m hoping that this sets the pace for the rest of my career. I don’t know if I’ll ever not feel at least a little like an outsider in a microbiology lab, but I know I want to continue approaching problems from multiple perspectives. Great collaborations really make that possible.

There’s an enormous amount of exciting research going on within BEACON, but I’m equally excited about the possibilities for outreach and education. Because evolution usually takes place on very long time scales, it can be extremely hard to demonstrate processes such as selection in a way that can be seen and understood within a few minutes. When this is accomplished, though, evolution stops being just a vague concept and becomes a whole lot more approachable. Sometimes, this means stripping away notions of what life is based on our limited set of examples on Earth and looking to alternate worlds.

Biolume project. Rendering by Adam Brown.
One unique opportunity that being a part of this community has afforded me is a collaboration with BEACON’s artist in residence, Adam Brown (http://adamwbrown.net/), for his Biolume project (http://adamwbrown.net/projects-2/biolume/). In this project, glowing, sensing, noisy, and evolving robotic units will be attached to walls, where they will interact with each other and with people who walk by. Once I found out that Adam was planning to create large populations of these Biolumes, I was immediately excited by the possibility of evolving behaviors on these robots in a way that visitors could observe and, most importantly, affect! I can’t think of a better way for people to learn about topics like natural selection than to participate in the process of selection itself, defining which behaviors are beneficial in the environment and which ones should quickly lead to extinction.
