Read Me File from the Package

This is the file readme.txt from the dna_composition.tgz tarball::

	DNA Composition Software
	(CC by-sa) 2007
	R. Mark Adams, Ph.D.

	INTRODUCTION
	------------------------------------------------------------

	In this folder you will find a set of tools in the Python programming
	language for translating DNA sequences into music.  This software is
	under heavy development, and consequently is very rough.  It is usable,
	though, and produces interesting (if limited) music from pretty much
	any DNA sequence it is given.  It works best with sequences of a few
	hundred to a few thousand base pairs; try cDNA sequences from GenBank
	(see http://www.ncbi.nlm.nih.gov/entrez/) or non-coding regions of
	similar length for interesting results.

	The basic classes are in the Melody.py and Chords.py files, and both
	are also pretty rough.  The chords in particular are limited to a
	narrow subset of possible chords, but I am working on it!  I based the
	MIDI side of the work on the extraordinary PythonMIDI library by mxm
	(see http://www.mxm.dk/products/public/pythonmidi for details).

	From the documentation in aaclass_composition.py:

	# Composition based on amino acid classes from translated DNA sequence
	# data.  It is based on the concept that amino acids fall into a well-
	# defined set of chemical classes (see Adams, Das, Smith, (1996)
	# Protein Science for more details.)

	# Classes:
	# (From the pima (Smith and Smith, 1992) man page)
	#
	#
	#      Original Amino Acid Class Hierarchy Alphabet (Class1 alphabet):


	#                        Amino Acid Classes                     Match score

	#                                                                   -2
	#                 _______________ X __________________               0
	#                /          /           \             \
	#             _ f _        /       ______r _______     \             1
	#           /  /    \     /       /   /     \     \     \
	#          /  c      \   e       /   m       p     \   _ j __        2
	#         /  /  \     \ / \     /   / \     / \     \ /   \  \
	#        /  a    b     d   \   /   l   k   o   n     i     h  \      3
	#       /  / \  / \   /|\   \ /   / \ / \ / \  /\   / \   / \  \
	#      C   I V  L M  F W Y   H   N   D   E  Q  K R  S T   A G   P    5


	#      For both alphabets, gaps are denoted by "g"s.

	# For the purposes of this composition, I will use three classes:
	#
	# (C,I,V,L,M,F,W,Y) - nonpolar
	# (H,N,D,E,Q,K,R,S,T) - polar
	# (A,G,P) - small
	#
	# Each class will be associated with chord transition rules, based on the
	# chord transition maps developed by Steve Mugglin (see:
	# http://chordmaps.com/part3.htm).  When translated into a Python dictionary:
	#
	# {'ii':('iii','V','ii'),'iii':('vi','IV','iii'),'IV':('ii','V','I','IV'),\
	#    'V':('I','V'),'vi':('IV','ii','I'),'I':('ii','V','iii','vi','IV','V','I')}
	#
	# This is easily translated into the 'C' key, which I am using for these
	# experiments (thanks, emacs replace-regexp!):
	# 
	# {'Dm':('Em','G','Dm'),'Em':('Am','F','Em'),'F':('Dm','G','C','F'),\
	#    'G':('C','G'),'Am':('F','Dm','C'),'C':('Dm','G','Em','Am','F','G','C')}
	#
	# So:  The algorithm is:
	#
	# (I)   Each base gets a middle 'C' eighth-note hit  <done>
	# (II)  For each attempted codon translation:
	#         (a) For a codon not in the open reading frame, no additional notes
	#         (b) For a codon in a reading frame:
	#             (0) If it is a stop codon, make the current chord 'I' ('C')
	#             (1) If it is not in the current biochemical class:
	#                 Make an allowed chord the current chord, based on the
	#                    codon triplet: add the values of its three bases
	#                    (given below), modulo the number of allowed
	#                    transitions from the current chord
	#                 Make the current melody note the base of the current chord
	#             (2) If it is in the same biochemical class:
	#                 Stay in the current chord
	# (III) For the bases in the codon triplet:
	#         (a) For each of the three bases:
	#             (1) If it is an 'A': make the current melody note a note higher
	#                 in the current chord ......A=+1
	#             (2) If it is a 'T': make the current note a note lower in the
	#                 current chord        ......T=-1
	#             (3) If it is a 'G': make the current note the same in the current
	#                 chord                ......G=0
	#             (4) If it is a 'C': add a rest- add no new note this codon
	#                                      ......C=0

	INSTALLATION
	------------------------------------------------------------

	You will need a working installation of Python (you can get it from
	http://www.python.org).  I have tested it on version 2.3 on Linux, Mac
	OS X and Windows XP with no problems.

	Expand the contents of the dna_composition.tgz tarball somewhere
	convenient.

	You will also need to grab PythonMIDI from mxm's site:
	http://www.mxm.dk/products/public/pythonmidi/download/midi.0.1.1.tar.gz

	You will need at minimum MidiOutFile.py, constants.py, MidiOutStream.py,
	DataTypeConverters.py, and RawOutstreamFile.py from that tarball.
	Either expand it and put it somewhere on your Python path, or just put
	the files you need in the DNA_Composion directory (from dna_composition.tgz),
	and you should be all set.

	RUNNING
	------------------------------------------------------------
	Running the software is easy; just type:

	python aaclass_composition.py <input_file>

	Where <input_file> is a file containing a DNA sequence.  

	The sequence data can be in pretty much any format, as the software
	ignores anything that is not in [ATGCatcg].  This can lead to problems
	if the data you have contains names or other annotation data with
	those characters in it.  I have included a couple of example files,
	test.dna (non-coding data from ancient viral DNA integrated into the
	human genome) and cDNA_test.dna (the coding region for alcohol
	dehydrogenase) for you to try out.  Note that I just cut the DNA
	sequences out of their respective GenBank entries at
	http://www.ncbi.nlm.nih.gov/entrez .

	You will see a whole bunch of information zip by on the screen.  If
	you are on a slow computer, you can comment out the "print" calls in
	the aaclass_composition.py file.  Someday I will put in the
	appropriate "debug" flags, etc.  Right after I finish the Chord
	object... :-)

	Three files will result:

	<input_file>_dna_base_track.mid
	<input_file>_dna_melody_track.mid
	<input_file>_protein_chord.mid

	They can be played directly with your MIDI instrument or combined via
	a sequencer for even more fun.

	BUGS, THOUGHTS and TODOs
	------------------------------------------------------------
	The software should run as advertised if properly set up.  I have not
	tested it as thoroughly as I should, and so I am sure that there are
	all kinds of input that will give it fits, or at least give
	unpredictable results.  YMMV.

	As mentioned, the Chord object needs lots of work.  I am still porting
	big sections of the original C++ code to Python, and so it is only
	marginally functional; most of the interesting chord progressions
	cannot be added at this point.  I also want to add more structure and
	robustness so that it can be used more interactively or in other
	contexts.  The music that comes out from the sequences is neither as
	varied nor as useful as "auralization" as I would like, so this is
	merely the beginning.

	I didn't want to hold onto the code forever, though, so out it goes.
	As mentioned above, it is Creative Commons Attribution-ShareAlike,
	so please go ahead and do anything you want with it.  If you are
	kind, you will drop me a note or a pointer to the cool stuff you have
	built with it.  If you find bugs, or have features you think would be
	useful, please pass them along!  If you want to fix them, so much the
	better!  I will happily incorporate the changes into the codebase.

	Happy DNAing!  I hope you have as much fun as I did exploring the genome.

	-Mark
	rmadams@epotential.com

Read the source code for more details.
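
The chord rules in step (II) of the readme can be sketched in Python as follows. This is a minimal illustration and not the code in aaclass_composition.py. The three amino acid classes, the transition map in the key of C, and the base values (A=+1, T=-1, G=0, C=0) are copied from the readme, and the codon table is the standard genetic code. Everything else is an assumption made for the example: the names `CODON_TABLE`, `AA_CLASS`, `TRANSITIONS`, `BASE_VALUE` and `chord_progression` are hypothetical, every codon is treated as being in the reading frame, the starting chord is taken to be C, and a stop codon is assumed to leave the current class unchanged.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over TCAG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# The three classes named in the readme.
AA_CLASS = {}
for letters, name in (("CIVLMFWY", "nonpolar"),
                      ("HNDEQKRST", "polar"),
                      ("AGP", "small")):
    for aa in letters:
        AA_CLASS[aa] = name

# Chord transition map in the key of C, copied from the readme.
TRANSITIONS = {
    'Dm': ('Em', 'G', 'Dm'),
    'Em': ('Am', 'F', 'Em'),
    'F': ('Dm', 'G', 'C', 'F'),
    'G': ('C', 'G'),
    'Am': ('F', 'Dm', 'C'),
    'C': ('Dm', 'G', 'Em', 'Am', 'F', 'G', 'C'),
}

# Per-base values from step (III) of the readme.
BASE_VALUE = {'A': 1, 'T': -1, 'G': 0, 'C': 0}


def chord_progression(dna, start_chord='C'):
    """Return one chord per complete codon of an upper-case DNA string.

    Assumption: every codon is in frame (the readme's open-reading-frame
    test is not reproduced here).
    """
    chord = start_chord
    current_class = None
    chords = []
    for i in range(0, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        if aa == '*':
            # Stop codon: return to the 'I' chord, which is C in this key.
            chord = 'C'
        else:
            aa_class = AA_CLASS[aa]
            if aa_class != current_class:
                # Class changed: pick an allowed transition, indexed by the
                # summed base values modulo the number of allowed chords.
                allowed = TRANSITIONS[chord]
                index = sum(BASE_VALUE[b] for b in codon) % len(allowed)
                chord = allowed[index]
                current_class = aa_class
            # Same class: stay in the current chord.
        chords.append(chord)
    return chords
```

With these assumptions, `chord_progression("ATGGCCTAA")` returns `['Dm', 'Em', 'C']`: ATG (Met, nonpolar) sums to 0 and selects the first transition from C, GCC (Ala, small) sums to 0 and selects the first transition from Dm, and TAA is a stop codon. Because Python's `%` always returns a non-negative result for a positive divisor, a codon such as TTT (sum -3) still yields a valid index.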
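
The input handling described under RUNNING in the readme can be sketched like this. It is an illustration under stated assumptions, not the package's own code: `clean_sequence` and `output_names` are hypothetical helper names, upper-casing the result is an assumption, and the output names are assumed to be the full input file name with each suffix appended, since the readme does not say whether the extension is removed.

```python
import re


def clean_sequence(text):
    """Drop every character that is not A, T, G or C (either case).

    Upper-casing the result is an assumption made for this sketch.
    """
    return re.sub(r'[^ATGCatgc]', '', text).upper()


def output_names(input_file):
    """Return the three MIDI file names listed in the readme.

    Assumption: the suffixes are appended to the input name unchanged.
    """
    suffixes = ("_dna_base_track.mid",
                "_dna_melody_track.mid",
                "_protein_chord.mid")
    return [input_file + suffix for suffix in suffixes]
```

This also shows the pitfall the readme warns about. For the input `">test\natgc"`, the header line contributes two stray bases, so `clean_sequence` returns `"TTATGC"` rather than `"ATGC"`. Stripping names and annotation lines before running the tools avoids this.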