New Software: Data Mining

Posted by Dan on August 7, 2008 at 12:52 pm | Categories: Science, Software | 2 Comments

Some new software is in our Knowledge Discovery and Data Mining section. I can remember a time when “data mining” was a bit of an epithet in science (like “fishing expedition”), but now it has become an established way of finding links and connectivities in large data sets. Three new open source data mining programs appeared on our radar recently:

  • KNIME, pronounced [naim], is a modular data exploration platform that enables the user to visually create data flows (often referred to as pipelines), selectively execute some or all analysis steps, and later investigate the results through interactive views on data and models.
  • RapidMiner (formerly YALE) - not much detail is known about this package
  • Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.

Researching Open Science

Posted by Dan on July 31, 2008 at 2:41 pm | Categories: Science, open science | No Comments

I don’t know how I missed this before, but there’s a really interesting article from 2006 up at the Harvard Business School “Working Knowledge” site. It details some of Karim Lakhani’s results from a paper called ‘The Value of Openness in Scientific Problem Solving’. The paper itself is detailed research on different methods of scientific problem solving, and it is well worth a read for anyone in the Open Science movement. The authors went looking to see if “Broadcast Search” (i.e. telling the world what problem you are working on) is an effective means of problem solving. My favorite part of the paper:

Our most counter-intuitive finding was the positive and significant impact of the self-assessed distance between the problem and the solver’s field of expertise on the probability of creating a winning solution. This finding implies that the farther the solvers assessed the problem as being from their own field of expertise, the more likely they were to create a winning submission. We reason that the significance of this effect may be due to the ability of “outsiders” from relatively distant fields to see problems with fresh eyes and apply solutions that are novel to the problem domain but well known and understood by them.

Cool Radiohead interactive video

Posted by Dan on July 14, 2008 at 11:41 pm | Categories: Fun | No Comments

So, I like Radiohead. A lot. Kid A has been in permanent rotation in my music collection for a couple of years now. But their new video for House of Cards is something else entirely. It was generated from 3-D data of Thom Yorke’s face collected via a Geometric Informatics scanning system which uses structured light to capture 3D images at close proximity. There’s an official video, but the best part is the completely interactive data viewer. Try it yourself!

Automated out-of-plane finder?

Posted by Dan on July 1, 2008 at 8:41 pm | Categories: Science, Software | No Comments

The code I’ve been working on has some cool features. If you give it a list of atoms and bonds, it automatically figures out bend and dihedral interactions using simple graph concepts. That is, if the molecule has a bond between atoms i and j and another bond between atoms j and k, you can easily deduce that there’s a bend interaction between i, j, and k. Similar three-bond ideas can be used to automatically determine dihedral interactions: Find bonds i-j, j-k, and k-l, then you can deduce the torsion for i-j-k-l.
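The graph idea above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the actual code described in the post: atoms are plain integer indices, bonds are pairs of indices, and the function names (`build_adjacency`, `find_bends`, `find_torsions`) are made up for the example.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(bonds):
    """Map each atom index to the set of atoms it is bonded to."""
    adjacency = defaultdict(set)
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)
    return adjacency


def find_bends(bonds):
    """Return (i, j, k) for every pair of bonds i-j and j-k sharing atom j."""
    adjacency = build_adjacency(bonds)
    bends = []
    for j in sorted(adjacency):
        # Any two distinct neighbors of j define a bend centered on j.
        for i, k in combinations(sorted(adjacency[j]), 2):
            bends.append((i, j, k))
    return bends


def find_torsions(bonds):
    """Return (i, j, k, l) for every chain of three bonds i-j, j-k, k-l."""
    adjacency = build_adjacency(bonds)
    # Normalize so each central bond j-k is visited once, with j < k.
    central_bonds = sorted({tuple(sorted(bond)) for bond in bonds})
    torsions = []
    for j, k in central_bonds:
        for i in sorted(adjacency[j]):
            if i == k:
                continue
            for l in sorted(adjacency[k]):
                # Skip the central atoms and three-membered rings (l == i).
                if l == j or l == i:
                    continue
                torsions.append((i, j, k, l))
    return torsions
```

For a four-atom chain with bonds 0-1, 1-2, and 2-3, `find_bends` gives the two bends 0-1-2 and 1-2-3, and `find_torsions` gives the single torsion 0-1-2-3.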

For out-of-plane bends or improper torsions at the sp2 sites, there’s no simple graph theory way to determine an out-of-plane interaction. You actually need to know something about the chemical identity of the central atom. At least, I think this is the case. I’d love to be proven wrong, because keeping track of valences and bond counts is beyond the level of coding I wanted to include.
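To make the difficulty concrete, here is a rough Python sketch of the purely topological part: flag every atom with exactly three bonded neighbors as a candidate center. Everything here is an assumption for illustration (the function name, the optional `elements` mapping, and the default choice of carbon as the only "planar" element). Without element information it will also flag pyramidal centers such as the nitrogen in ammonia, which is exactly why the graph alone is not enough.

```python
from collections import defaultdict


def find_improper_candidates(bonds, elements=None, planar_elements=("C",)):
    """Return (center, a, b, c) for atoms with exactly three neighbors.

    bonds: iterable of (i, j) atom-index pairs.
    elements: optional mapping of atom index to element symbol
        (hypothetical input; the post's code may not carry this).
    planar_elements: symbols treated as planar when three-coordinate.
        Defaulting to carbon only is an assumption, not a chemical rule.
    """
    adjacency = defaultdict(set)
    for i, j in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)

    candidates = []
    for center in sorted(adjacency):
        neighbors = sorted(adjacency[center])
        if len(neighbors) != 3:
            continue
        # With no element data, every three-coordinate atom is a candidate.
        if elements is not None and elements[center] not in planar_elements:
            continue
        candidates.append((center, *neighbors))
    return candidates
```

With element labels supplied, a three-coordinate carbon (as in formaldehyde) is kept and a three-coordinate nitrogen is dropped. That filter is crude, since an amide nitrogen is planar too, so a real implementation would still need some notion of atom type.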

FooCamp? BarCamp?

Posted by Dan on June 30, 2008 at 9:10 am | Categories: Conferences, Science | No Comments

One of the more interesting aspects of the New Communication Channels workshop was something called the “SciBarCamp” that was organized by Jen Dodd. I’d never been at a meeting which used this format before, and I was a bit dubious when I first heard about it, but it worked well with the group that was at this meeting. Here’s how it functions:

  • After a morning of more traditional talks, everyone files into a large room. Each participant gets a sheet of paper on which they write their name and the name of a workshop that they are interested in leading.
  • Each of these sheets of paper gets tacked up to a board in the middle of the room, and people mill around looking at all of the proposed workshop titles. If you see a workshop that looks interesting, you vote for that workshop by bubbling in a circle on the sheet of paper.
  • The conference organizer can combine workshops if they look similar (in our case, a bunch of Wiki-related workshops were combined).
  • After about half an hour, the most popular workshops are selected and scheduled in particular rooms and time slots.
  • If your workshop was popular enough, you then have to lead it!
  • People can vote with their feet too; if a workshop is boring, you are encouraged to walk out and find one that isn’t (although in practice, few people actually did this).

Controversy was pretty much at a minimum because we were all converts to doing open science in one form or another (open source, open data, open access, open notebook). But we certainly got groups of people in each workshop who were guaranteed to be interested in the topic under discussion. After all, they’d voted for that workshop topic!

In order to make this work, you need a really good organizer to explain things up front. Scientists can be socially awkward and unwilling to try new formats, but this worked out well. I hope we start to see more of this kind of thing at smaller meetings.

Cool finds at the NCCB2008 workshop

Posted by Dan on June 27, 2008 at 12:25 pm | Categories: Policy, Science, Software | No Comments

Some of the cooler online resources that have been discussed at the NCCB2008 workshop:

New Communication Channels for Biology Workshop

Posted by Dan on June 25, 2008 at 9:20 am | Categories: Policy, Science | 3 Comments

I’m going to be giving a talk at the “New Communication Channels for Biology” Workshop run by the CalIT2 folks at UCSD. The workshop is Thursday and Friday, and there are going to be some interesting folks like Michael Nielsen, Hilary Spencer, Jean-Claude Bradley, Aaron Fulkerson, Michael Gribskov, and a bunch more. It should be pretty interesting!

Dear Apple: open the iPhone!

Posted by Dan on July 27, 2007 at 1:19 pm | Categories: Uncategorized | 8 Comments

My new iPhone is just about the most amazing piece of technology I’ve ever used. There’s just one problem: I hate hate hate the calculator. Apple’s normal desktop calculator on OS X is remarkably functional. It has the standard scientific functions, a programmer’s mode, and best of all, there’s an RPN mode, which is essential for those of us who grew up with HP scientific calculators and therefore can’t figure out why one would even want to use a “normal” calculator.

I guess I was expecting Apple to just port the desktop calculator to the iPhone. Instead, we’ve got a visually beautiful, but essentially useless, four-function widget. I know it is a visual tribute to Dieter Rams’s design of 1970s-era Braun desktop calculators, and I can appreciate simple and elegant design as much as any Apple fanboy. That’s what has me so annoyed; I love the look, but I need sin, cos, y^x, sqrt, log, e^x, DRG to RAD, and all the other functions of a scientific calculator.

I know there are a couple of good web-based scientific calculators designed for the iPhone: Belfry’s scicalc is one, iPhav has created MiniCalc, there’s a fake HP-35, and a pared-down barebones version. As great as those are, the lag to load up a web app when using EDGE is just too long. Apple, if you want to make us nerds truly happy, keep your tribute to Dieter Rams when the iPhone is upright, and when someone rotates their phone while in the calculator, have it switch automagically to an HP-15C emulator. (The 15C had a lovely brushed metal finish, and your own engineers can tell you that the 15C was the best calculator ever made.)

There’s also news that the iPhone calculator app has a relatively serious interface bug.

More importantly, my problems with the calculator wouldn’t be a big deal if third-party applications could run on the iPhone. (Legally, that is. Without jailbreak and iPhoneInterface, and plist editing and assembling a toolchain.) I’d happily start writing a replacement calculator myself if the SDK were available. So Apple, while you guys have done some truly astonishing engineering, can I humbly request that you open the iPhone to third party developers? Please?

Cool new software

Posted by Dan on June 28, 2007 at 8:18 am | Categories: Science, Software | 2 Comments

A whole bunch of new software to highlight today:

In our Engineering section, we have two new packages: OOFEM is an object oriented, parallel, multiphysics finite element code system for solving mechanical, transport and fluid mechanics problems, and ASCEND is a generalized modelling environment for engineering and science problems. It offers: an object-oriented model description language for describing your system, an interactive user interface that allows you to solve your model and explore the effect of changing the model parameters, and a scripting environment that allows you to automate your more complex simulation problems.

Our Molecule Viewers and Editors section sees the addition of Avogadro, a new molecular editor built on OpenBabel that looks like it will be great (although I can’t find a download link to try it out).

I’m really excited to see CP2K show up in our Theoretical & Computational Chemistry section. CP2K performs atomistic and molecular simulations of solid state, liquid, molecular and biological systems. It provides a general framework for different methods such as density functional theory (DFT) using a mixed Gaussian and plane waves approach (GPW), and classical pair and many-body potentials.

PhyloSort is a neat Java code that sorts phylogenetic trees by searching for user-specified subtrees that contain a monophyletic group of interest defined by operational taxonomic units. Look for it in our Bioinformatics section.

And the newest link is to massXpert, a follow-on package by the author of polyXmass. massXpert simulates and analyzes mass spectrometry data obtained on linear (bio-)polymers. Both massXpert and polyXmass can be found in our Analytical Chemistry section.

Check them out and keep those suggestions coming!


Heat Capacity of Water

Posted by Dan on June 27, 2007 at 2:48 pm | Categories: Fun, Science, education | 1 Comment

It is no secret to my students, family and friends that I’m now completely obsessed by the odd properties of water, including the anomalously high heat capacity. Here’s a neat parlor trick involving this water anomaly that is masterfully explained by Robert Krampf in one of his many great Science Experiment videos: Heating a Balloon.
