11.12.2009
Neuro musings, part 1: neurobiology, psychology, and the missing link(s)
The central problem of neuroscience is that despite all the advancements happening in medical science, we have embarrassingly few ways to quantify, or talk quantitatively about, mid-level functional differences between people's brains.
It's not that we have no tools at all for quantifying function and individual differences: we can draw correlations between specific genes and certain behavioral traits or neurophysiological features. We have the DSM IV (and soon, DSM V) as a sort of handbook on the symptoms of common brain-related problems. We have the Myers-Briggs and related personality-typing tests, we have psychometric tests, we have various scans that pick up gross neuroanatomy (and we can sometimes correlate this with behavioral deficits), and we have the fMRI, which can measure raw neural activity through the proxy of where blood flows in the brain.
The problem is that these methods of understanding brains are heavily clustered in two opposite areas: the reductionist neuroanatomical approach, which is great as far as it goes, but doesn't go far enough up the ladder of abstraction to explain much about everyday behavior, and the symptom-centric psychological approach, which may be a great description of how various people behave, or some common neural equilibria, but really explains very little.[1] There's a great deal of room in neuroscience for an ontology with which to talk about this underserved middle level of brain function, and for mid-level tools which attempt to measure it and correlate things with it.[2]
Of course, the natural question regarding these mid-level approaches to understanding the brain is whether we can find ontologies and tools which can be said to "carve reality at its joints," or not be based on a terribly leaky level of abstraction (as, for example, the DSM IV fails at), yet have direct relevance to psychological events as we experience them in ourselves and in others (as, for example, the DSM IV does). I don't have any answers! But I do have ideas.
[1] To paraphrase Sir Karl Popper, implicit in any true explanation of a phenomenon is a prediction, and implicit in any prediction about a phenomenon is an explanation. So a good way to figure out how much of a field is true scientific explanation vs. 'mere stamp-collecting' is to check how much it deals with predictions, whether explicit or implicit. Psychology seems to be a primarily descriptive field that's attempting to translate its rich (yet predictively shallow) descriptive ontology into a more prediction-based science.
[2] I realize this is somewhat vague. I plan to expand this description of what I think of as "mid-level functional attributes" and the sorts of concepts and tools I think may be useful for dealing with them. One example of a mid-level measurement that struck me as promising was a work correlating lack of microstructural integrity in the uncinate fasciculus with psychopathy.
11.11.2009
HIC SVNT DRACONES
What's a Modern Dragon anyway?
Back in the Middle Ages, cartographers used to (anecdotally, at least) mark unknown or dangerous territories on their maps with the Latin phrase, HIC SVNT DRACONES-- literally, "Here be Dragons". By metaphor, then, the purpose of this blog is to locate, explore, and perhaps take a swing at the analogous dragons in our modern age-- the puzzles, frontiers, and dangerous elements within science, culture, and this terribly uncertain future of ours.
9.30.2009
Quote: on the evolution of reading
Here, I am reminded not of the recent past but of a huge change that occurred in the middle-ages when humans transformed their cognitive lives by learning to read silently. Originally, people could only read books by reading each page out loud. Monks would whisper, of course, but the dedicated reading by so many in an enclosed space must have been an highly distracting affair. It was St Aquinas who amazed his fellow believers by demonstrating that without pronouncing words he could retain the information he found on the page. At the time, his skill was seen as a miracle, but gradually human readers learned to read by keeping things inside and not saying the words they were reading out loud. From this simple adjustment, seemingly miraculous at the time, a great transformation of the human mind took place, and so began the age of intense private study so familiar to us now; whose universities where ideas could turn silently in large minds.
- Dr. Barry Smith, University of London, while discussing Edge Magazine's 2009 question, What will change everything?
Edit: a commenter has suggested it was actually St. Ambrose, not St. Aquinas, who first broke this ground.
9.12.2009
A simple and cheap proposal for improving American health
In short, I'd like to see a federally-funded, state-by-state performance-based incentive program to improve public health. Specifically, the federal government sets aside a decent chunk of money and sets targets for curbing health problems: e.g., "Reduce the growth of childhood diabetes in your state by 50% by 2012" or "Reduce the growth of cardiovascular disease in your state by 40% by 2013." If state A meets the target, they get generous federal funds for doing so. If state B fails to meet the target, they don't. Ideally, this would generate a lot of creativity in actually solving the targeted problems (since real money for the state would be on the line), but states would also have incentive to copy what works.
This program might cost some money-- but we'd be paying for results: if it flopped and nobody hit these targets, well, it'd have cost nothing. On the other hand, if this program got results, even if we consider the money going to states to be 'wasted', the program would still be a net financial gain from the perspective of decreased strain on our health systems. In other words, with a results-based incentive system, we have nothing to lose if it flops and plenty to gain if it works.
Now, I'm sure the devil would be in the details. We'd need to pick targets that are easy to representatively measure and hard to game. It also seems like we could have a yearly governors' conference revolving around this incentive program for states to share tips on what strategies are working and which aren't. Make this conference (and the incentive program in general) a big deal, and make it competitive-- make states proud of their successes and ashamed of their failures.
In general, it seems to me that this sort of grand state-by-state competition for funds could be extended to a lot of social problems. Since it incentivizes results instead of naive/bureaucratic thinking, it might encourage some smart, actionable analysis about the roots of various social problems. But that's something to explore another time. My point is, I think this would work really well for improving public health, and we should do it.
p.s. Anyone have a good way of getting this idea into the hands of some congressperson?
8.04.2009
Quote: China on China
Deng Xiaoping, the Chinese leader who ushered in its market reforms starting in the late 1970s, famously gave his country the following advice: “Observe calmly; secure our position; cope with affairs calmly; hide our capacities and bide our time; be good at maintaining a low profile; and never claim leadership.”
This seems to be the general trend in Chinese foreign policy; if the Chinese leadership decide this is no longer necessary or desirable, we could suddenly live in a very different world.
I think a particularly interesting and volatile element to this is that the Chinese Government has a relatively solid hold on power, but this hold is largely tied to the year-over-year economic growth China has been experiencing for decades. The Chinese are content to tolerate their government because life is getting better, and looks to get better still. Should this growth dry up, there's no telling what may happen domestically, or what nationalistic conflicts the Chinese Government may enter into as a ploy to unify their people.
7.11.2009
On writing, and the beauty of archive.org
My latest archive.org-assisted rediscovery is of a wonderful little essay on the difficulty of writing vs programming by Paul Graham. Archive.org isn't google-searchable, and so when Graham deleted his infogami blog this gem vanished down the memory hole. I'll quote it in full for your pleasure and to get it back in circulation.
Paul Graham
Why Writing is Harder than Programming
3 Oct 06
I spent most of this summer hacking a new version of Arc. That's why I haven't written any new essays lately. But I had to start writing again a few days ago because I have to give a talk at MIT tomorrow.
Switching back to writing has confirmed something I've always suspected: writing is harder than hacking. They're both hard to do well, but writing has an additional element of panic that isn't there in hacking.
With hacking, you never have to worry how something is going to come out. Software doesn't "come out." If there's something you don't like, you change it. So programming has the same relaxing quality as building stuff out of Lego. You know you're going to win in the end. Succeeding is simply a matter of defining what winning is, and possibly spending a lot of time getting there. Those can be hard, but not frightening.
Whereas writing is like painting. You don't have the same total control over the medium. In fact, you probably wouldn't want it. When it's going well, painting from life is something you do in hardware. There are stretches where perception flows in through your eye and out through your hand, with no conscious intervention. If you tried to think consciously about each motion your hand was making, you'd just produce something stilted.
The result is that writing and painting have an ingredient that's missing in hacking and Lego: suspense. An essay can come out badly. Or at least, you worry it can.
I think good writers can push writing in the direction of Lego. As you get more willing to discard and rewrite stuff, you approach that feeling of total control you get with Lego and hacking. If there's something you don't like, you change it. At least, as I've gotten better at writing that's what's happened to me. I've become much more willing to throw stuff away.
But though you get closer to the calmness of hacking, you never get there. What a difference it is walking into the Square to get a cup of tea with a half-finished essay in my head, rather than a half-finished program. A half-finished program is a pleasing diversion-- a puzzle. A half-finished essay is mostly just a worry. What if you don't think of the other half?
It's possible that hacking is only easy because we have poor tools (and low expectations to match). Maybe if you had really powerful tools you'd tell a computer what to do in a way that was more like writing or painting. Lego, pleasing as it is, can't do what oil paint can. That would be an alarming variant of hundred year language: one that was as powerful and as frightening as prose. But that's exactly the sort of trick the future tends to play on you.
7.02.2009
Quote: on the economic situation
The biggest problem today [in our economic situation] is that nobody really knows what the value of anything is.
- Kermit Johnson
(Why yes, Dad, I *do* listen!)
6.29.2009
Our broken grant system
“Scientists don’t like talking about it publicly,” because they worry that their remarks will be viewed as lashing out at the health institutes, which supports them, said Dr. Richard D. Klausner, a former director of the National Cancer Institute.
But, Dr. Klausner added: “There is no conversation that I have ever had about the grant system that doesn’t have an incredible sense of consensus that it is not working. That is a terrible wasted opportunity for the scientists, patients, the nation and the world.”
John Hawks has some clever and good commentary on the situation, bringing in some evolutionary theory about search space and fitness peaks to support the point that yes, we're funding the wrong sorts of grant proposals when we go for timid, incremental projects given our current state of knowledge.
A pie-in-the-sky idea
As sort of an ideal-world scenario, instead of routing all proposals through the most established and senior of scientists, I'd like to see a modest amount of future NIH funding be set aside and overseen by graduate students in seminars across the country. Essentially, students could sign up for a seminar where their coursework would be to analyze a set of grant applications pertaining to their field, learn about the science in each grant and about the grant system, and finally select the top 1-2 grants to be funded. The professor teaching the class would be in charge of the syllabus, but with the following three guidelines:
1. Attempt to choose the best grant proposals;
2. The students, not the professor, have the final say in which proposals get funded;
3. Use the class as a teaching tool for both the science involved in the grants, and the grant system itself.
The set of grant applications to evaluate could be drawn from the pool of applications the NIH has rejected, but still deems interesting and not based on bad science.
There would be a million details to fill in, but I guarantee this system would be consistently fresh and open to new ideas (I don't know if anyone has noticed this, but grad students are really smart and creative!), yet would still be grounded in science and experience. It'd also be a fantastic teaching tool.
6.28.2009
Now leaving Era of the Mystery. All aboard for Era of the Tool.
- Solve an outstanding mystery;
- Gather and publish new data;
- Construct a new tool.
Gathering and publishing new data has constituted, and will constitute for the foreseeable future, the majority of scientific publication. Science has a healthy and voracious appetite for data, and this isn't likely to change anytime soon. The interesting thing about progress in science today though, and the topic of this post, is the balance between the first and third sort of approach, mystery vs tool.
Era of the Scientific Mystery
By and large, the emphasis in science used to be on solving mysteries. Discovering the mechanism of genetic inheritance; decoding the structure of DNA; deciphering how viruses take over cells. Scientists were billed as detectives, and the height of scientific achievement was to find an "aha" insight that solved an outstanding mystery. But-- though some scientists may vociferously deny this-- we've been so successful at solving the fundamental mysteries out there that we're running out of this kind of mystery in many branches of science. In turn, science is gradually becoming less about solving foundational unknowns (like decoding the structure of DNA) and more about creating tools by which to more richly and more quantitatively understand what is no longer mysterious but too complex to trust to our intuitions and simple equations.
Era of the Scientific Tool
Scientific progress has always had a strong tool component. Grind a better lens, see the stars better, and create a more accurate description of the galaxy; build a free-swinging pendulum, observe the shifting plane of motion, and conclude the Earth is not fixed but rotates. These sorts of things were not uncommon in the history of science. But there seems to be a sea change happening: modern scientific publication is beginning to center around devising and applying tools that in turn generate interesting results.
Two examples of this from my own experience are the recent publications of a couple friends who are scientists, John Hawks (UW Madison) and Bryan W. Jones (U Utah).
Hawks made waves with a recent publication, Recent Acceleration of Human Adaptive Evolution, which applied an established genetics tool (linkage disequilibrium) to the context of the human genome and came to the conclusion that not only did human evolution not stop with the advent of civilization, but that it actually sped up over a hundredfold.
Jones just published A Computational Framework for Ultrastructural Mapping of Neural Circuitry, a work which defined a new integrative workflow which enabled, for the first time, the mapping of a large-scale neural connectome, and offered the first product of this workflow, a connectome map of a rabbit's retina.
Tools are absolutely central to both publications: the first is based on the novel application of an existing tool to a context it hadn't been applied in, and the second involved inventing a new tool to enable the generation of new datasets.
These examples are anecdotal, to be sure-- but it seems that although the meme of the scientific mystery will be with us for a long time, and though there are sporadic fundamental unknowns yet to discover, increasingly the really sexy, generative results in science involve creating or repurposing a tool to shed new light on some data, or generate data at an exponentially faster rate.
In short? Science is no longer about mysteries but about problems. And given the right tool, problems solve themselves.
Notes:
- Kevin Kelly's Speculations on the Future of Science is an interesting survey of possible tools science may grow into.
6.11.2009
Brainstorm: Logarithmic Evolution Distance
Exponential advances in gene sequencing technology have produced an embarrassment of riches: we're now able to almost trivially sequence an organism's DNA, yet sifting meaning from these genomes is still an incredibly intensive and haphazard task. For instance, consider the following simple questions:
How close are the genetics of dogs and humans? How does this compare to cats and humans? What about mice and cats? How different, genetically, are mice and corn?
We have all the necessary genomic data to answer these questions, and we can calculate answers of a sort-- but the types of answers we can give at this point are rather sparse and definitely not intuitively satisfying.
Whenever we can ask simple questions about empirical phenomena that don't seem to have elegant answers, it's often a sign there's a niche for a new conceptual tool. This is a stab at a tool that I believe could deal with these questions more cogently and intelligently than current approaches.
Logarithmic Evolution Distance: an intuitive approach to quantifying difference between genomes.
How do we currently compare two genomes and put a figure on how close they are? The fashionable metrics seem to be:
- Raw % similarity in genetic code-- e.g., "Humans and dogs share 85% of their genetic sequence." Or 70%. Or 98%, depending on who you ask. However, what does this really say? It's a non-intuitive answer, particularly since there are so many ways to calculate the figure for this, depending on how one evaluates CNVs and functional parity in sequences. And this tends to grossly understate the importance of regulatory elements.
- Gene homologue analysis-- e.g., "The dog genome has gene homologues for ~99.8% of the human genome." However, neither the magnitude nor the functional meaning of the difference between two genomes having 99% homologous genes and 99.8% homologous genes is apparent. This approach also involves deep ambiguities in assuming homologue function, in assessing what constitutes a similar-enough homologue, and in dealing with CNVs-- and this 'roll up your sleeves and compare the functional nuts and bolts of two genomes' approach is also extremely labor-intensive.
- Time since evolutionary divergence-- e.g., "The latest common ancestor of dogs and cats lived 60 MYA, vs that of dogs and humans, which lived 95 MYA." However, though time seems a relatively good proxy for estimating how far apart two genomes are, there are many examples of false positives and false negatives for this heuristic. Selection strength and rate of genetic change can vary widely in different circumstances, and thus there are reasons to believe this heuristic is often deeply and systemically biased as a proxy for genome difference.
None of these approaches really gives wrong answers to the questions I posed, but neither do they always, or often, give helpful and intuitive answers. They fail the "ok, but what does it mean?" test.
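To make the first complaint concrete, here is a minimal Python sketch of how a "raw % similarity" figure shifts with one counting choice. The function name `percent_identity`, the `count_gaps` switch, and the two toy sequences are all made up for illustration; real pipelines involve an alignment step and many more choices than this.

```python
def percent_identity(aligned_a, aligned_b, count_gaps=True):
    """Percent of compared columns where two pre-aligned sequences match.

    '-' marks an alignment gap. With count_gaps=True, gapped columns count
    as mismatches; with count_gaps=False they are skipped entirely.
    (Hypothetical helper, not a standard library function.)
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        is_gap = x == "-" or y == "-"
        if is_gap and not count_gaps:
            continue  # ignore this column altogether
        compared += 1
        if x == y and not is_gap:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy, hand-aligned sequences (illustrative only).
a = "ACGT-ACGTA"
b = "ACGTTAC-TC"
print(percent_identity(a, b, count_gaps=True))   # 70.0
print(percent_identity(a, b, count_gaps=False))  # 87.5
```

Same two sequences, two defensible numbers, and that is before touching CNVs or regulatory elements.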
So here's my suggestion for a new approach-
'Evolution Distance' - a rough computational/simulation estimate (useful in a relative sense) of the average number of generations of artificial selection it would take to evolve organism X into organism Y under standardized conditions, given a set of allowed types of mutations.
To back up a bit, a (rough) way to explain what this idea is about is, take some cats. Breed them. Every generation, take the most genetically doglike cats, and breed them together. Eventually(!) you'll get a dog. What this tool does, essentially, is computationally estimate how many generations of selection [edit: mutation] it would take to go from genome A (a cat) to genome B (a dog). The number of generations is this 'evolution distance' between the genomes.
Now, what makes a dog a dog? We can identify several different thresholds for success-- an exact DNA match would be the gold standard, followed by a match of the DNA that codes for proteins, followed by estimated reproductive compatibility, followed by specific subsystem similarities, and so forth. The answer would be in terms of X to Y generations, 95% Confidence Interval, in log notation like the Richter scale, as it could vary so widely between organisms... call it LED for Logarithmic Evolution Distance. Arbitrarily, an LED of 1 might be 1k generations, an LED of 2 would be 10k generations, etc.
E.g., the LED of a baboon and a chimpanzee might be 1.8-1.9;
Of a giraffe and a hippo might be 3.4-3.6;
Starfish and a particular strain of E. coli might be 10.2-10.4. (That's a lot!)
I'm just throwing out some numbers, and may not be in the right ballpark... but the point is this is an intuitive, quantitative metric that can scale from comparing the genetics of parent and offspring all the way to comparing opposite branches of the tree of life.
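The arbitrary scale above (LED 1 = 1k generations, LED 2 = 10k generations) works out to LED = log10(generations) - 2. A small Python sketch of the conversion in both directions; the function names are mine, and the scale itself is only the placeholder convention suggested above.

```python
import math


def led_from_generations(generations):
    """Convert a generation count to LED, using LED 1 = 1,000 generations."""
    if generations <= 0:
        raise ValueError("generations must be positive")
    return math.log10(generations) - 2.0


def generations_from_led(led):
    """Inverse of led_from_generations."""
    return 10.0 ** (led + 2.0)


print(led_from_generations(1_000))    # LED of 1
print(led_from_generations(10_000))   # LED of 2
print(round(generations_from_led(1.8)), round(generations_from_led(1.9)))
```

On this convention, the made-up 1.8-1.9 range corresponds to roughly 6,300 to 7,900 generations.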
This is intrinsically a quick and dirty estimate, very difficult to get (& prove) 'correct', but given that, it is
1. potentially very useful as a relative, quantitative metric,
2. intuitive in a way current measures of genetic similarity aren't,
3. fully computational with a relatively straightforward interpretation-- you'd set up a model, put in two genomes, and get an answer.
This estimate could, and would need to, operate with a significantly simplified model of selection. Later, the approach could slowly add in gene pools, simulation & function-aware aspects, mutation mechanics, the geometry of mutation hotspots, mutations incompatible with life, gene patterns that protect against mutations, HGT, etc. But it would start as, and be most helpful as, a very rough metric.
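As a sense of what the "significantly simplified model" could look like at its very crudest, here is a toy Python sketch: genomes are equal-length strings over ACGT, each generation produces a brood of randomly point-mutated offspring, and the offspring closest to the target (by Hamming distance) is kept. Everything here is an assumption for illustration, including the names, the brood size, the mutation rate, and the restriction to point mutations on equal-length sequences. It is nowhere near a real model.

```python
import random

ALPHABET = "ACGT"


def hamming(a, b):
    """Number of positions where two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def mutate(genome, rate, rng):
    """Replace each base with a random base with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base
        for base in genome
    )


def generations_to_target(start, target, offspring=50, rate=0.02,
                          max_generations=100_000, seed=0):
    """Count generations of 'breed, keep the most target-like' selection.

    Returns None if the target is not reached within max_generations.
    """
    if len(start) != len(target):
        raise ValueError("toy model only handles equal-length genomes")
    rng = random.Random(seed)
    current = start
    generations = 0
    while current != target and generations < max_generations:
        brood = [mutate(current, rate, rng) for _ in range(offspring)]
        brood.append(current)  # keep the parent so distance never increases
        current = min(brood, key=lambda g: hamming(g, target))
        generations += 1
    return generations if current == target else None


print(generations_to_target("AAAAAAAAAA", "ACGTACGTAC"))
```

The count depends heavily on the brood size and mutation rate, which is exactly why the conditions would need to be standardized before any two LED figures could be compared.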
Variations:
1. Instead of being based on random mutations and pruning, perhaps the algorithm could be tuned to map out a shortest mutational path from genome A to genome B, given a certain amount of allowed mutation per generation. This would be less indicative of the randomness of evolution, but perhaps a tighter, more tractable, and more realistic estimate of the number of generations' worth of distance. [Note- I'm coming around to the idea that this is the better approach.]
2. Depending on the progress of tissue and functional domain gene expression analysis and what inherent and epistemological messiness lies therein, this could be applied to subsets of organisms: finding a provisional sort of evolution distance between organism X's immune system and organism Y's immune system, or limbs, or heart, etc. Much less conceptually elegant, but perhaps still useful.
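The first variation, the shortest mutational path, has a familiar crude stand-in: edit distance. A minimal Python sketch, assuming only single-base insertions, deletions, and substitutions count as "allowed mutations" and that a fixed number of them can land per generation. Both assumptions, and the function names, are mine.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance: fewest single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, or free if equal
            ))
        prev = curr
    return prev[-1]


def min_generations(a, b, mutations_per_generation):
    """Lower bound on generations if each allows a fixed number of edits."""
    if mutations_per_generation < 1:
        raise ValueError("need at least one mutation per generation")
    return math.ceil(edit_distance(a, b) / mutations_per_generation)


print(min_generations("GATTACA", "GCATGCT", mutations_per_generation=2))
```

This dynamic-programming version is quadratic in sequence length, so whole genomes would need something much cleverer, and real mutation types (duplications, inversions, and so on) would need their own costs.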
Practical applications (why would this be useful?):
In general, I see this as an intuitive metric to compare any two genomes that could see wide use-- after the general model is built, the beauty of this approach is that it's automated and quantitative. Just input any two arbitrary genomes, input some mutational parameters, and you get an answer. Biology is coming into an embarrassment of riches in terms of sequencing genomes. This is a tool that can hopefully help people, both scientists and laymen, make better intuitive sense of all this data.
A specific use for this would be to compare the ratio of calculated LED to the time since evolutionary divergence while controlling for time between generations. This would presumably be a reasonable (and easy-to-do) measure to detect and compare strength of selection, perhaps helpful as a supplement to metrics such as linkage disequilibrium analysis. Alternatively, if the genome of two organisms' last common ancestor can be inferred, the LED of LCA's genome->genome A vs the LED of LCA's genome->genome B would presumably be an excellent quantitative indicator of relative strength of selection.
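One way to read that ratio, sketched in Python: turn the LED back into an estimated generation count, turn the divergence time into elapsed generations using an average generation time, and divide. The function names, the LED-to-generations convention (LED 1 = 1k generations), and the numbers in the example are assumptions for illustration, not measurements.

```python
def elapsed_generations(divergence_years, generation_time_years):
    """Generations that fit into the time since divergence."""
    if generation_time_years <= 0:
        raise ValueError("generation time must be positive")
    return divergence_years / generation_time_years


def selection_ratio(led, divergence_years, generation_time_years):
    """Estimated selection-generations (from LED) per elapsed generation.

    Higher values would suggest more genetic change packed into the same
    number of generations. Only meaningful as a relative comparison.
    """
    estimated = 10.0 ** (led + 2.0)  # assumes LED 1 = 1,000 generations
    return estimated / elapsed_generations(divergence_years,
                                           generation_time_years)


# Made-up inputs: LED of 2.0, 100,000 years of divergence, 10-year generations.
print(selection_ratio(2.0, 100_000, 10))
```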
This metric is by no means limited to comparisons between species; comparing Great Danes to Pitbulls with this tool, or even two Pitbulls to each other, would generate interesting results.
This tool would also be helpful in an educational context, to drive home the point that everything living really is connected to everything else, and evolution is the web that connects them. It's also educational in the sense that it'd actually simulate a simplified form of genetic evolution, and we may learn a great deal from rolling up our sleeves and seeing how well our answers compare to nature's.
Open questions:
- This comparison as explained does not deal with the complexity of sexual recombination or of horizontal gene transfer (though to be fair, none of its competitors do either). Or, to dig a little deeper, evolution happens on gene pools, whereas this tool only treats evolution as mutation on single genomes. Does it still produce a usably unbiased result in most comparisons? (My intuition is if we're going for an absolute estimate of an 'evolution distance', no; a relative comparison, yes.)
- Would direction matter? It depends on how simple the model is, but realistically, it's very likely. E.g., the LED of a dog -> cat might be significantly different than cat -> dog. Presumably it'd matter the most in deep, structural changes such as prokaryote <-> eukaryote evolution. Loss of function/structure is always easier to evolve than new function/structure.
- How realistically could one model the conditions that these evolutionary simulations would operate under? E.g., would the number of offspring need to be arbitrary for each simulation? Would the rate of mutation vary between dogs and cats? How could the model be responsive to operation under different ecosystems? How to deal with many changes in these quantities over time, if you're charting a large LED (e.g., bacteria->cat)? I guess the answer to this is, you could make things as complicated as you wanted. But you wouldn't have to.
- In theory, the impact of genetic differences between arbitrary members of the same species would be minimized by the logarithmic nature of the metric. Would this usually be the case? Presumably LED could be used to explore variation pertaining to this: e.g., species X has a mean LED of 1.4, whereas species Y has a mean LED of 1.6.
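On the direction question above: even a very simple model becomes asymmetric once losing material is cheaper than gaining it. A Python sketch using a weighted edit distance where insertions cost more than deletions; the costs (3 vs 1) and the function name are arbitrary assumptions chosen only to show the effect.

```python
def directed_distance(source, target, insert_cost=3, delete_cost=1,
                      substitute_cost=1):
    """Weighted edit distance from source to target.

    With insert_cost != delete_cost the result depends on direction.
    """
    prev = [j * insert_cost for j in range(len(target) + 1)]
    for i, x in enumerate(source, start=1):
        curr = [i * delete_cost]
        for j, y in enumerate(target, start=1):
            curr.append(min(
                prev[j] + delete_cost,       # drop x from the source
                curr[j - 1] + insert_cost,   # add y to reach the target
                prev[j - 1] + (0 if x == y else substitute_cost),
            ))
        prev = curr
    return prev[-1]


print(directed_distance("AAAA", "AA"))  # 2: two cheap deletions
print(directed_distance("AA", "AAAA"))  # 6: two costly insertions
```

So whether A -> B and B -> A come out the same is a modeling choice, not something the metric settles on its own.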
Anyway, this is a different way of looking at the differences between genomes. Not more or less correct than others-- but, at least in some cases, I think more helpful.
Edit, 9/27/09: Just read an important paper on the difficulty of reversing some types of molecular evolution, since neutral genetic drift accumulated after a shift in function may not be neutral in the original functional context. In the context of Logarithmic Evolution Distance, I think it underscores the point that LED can't be taken literally, since it doesn't take function or fitness into account. But then again, neither do the other tools it's competing against, and this doesn't impact its core function as an estimation-based tool with which to make relative comparisons.

