Biological Databases and Distributed Computing (140.637)
2nd term, 2009, MWF 1:30-2:20, W2009
Instructors: F. Pineda, M. Ochs
Description
Students will be introduced to the principles and skills required to represent knowledge and to develop biological and biomedical databases. They will learn the principles behind semantic web technologies and how these can spur data integration in biomedical research.
The course will begin with an introduction to the semantic web and the ontologies and ontological reasoning that underlie its function. Students will learn about ontology creation and use, especially the use of mature medical ontologies and emerging biological ontologies for encoding data within databases. Students will learn the principles and practice of medical and biological database design using MySQL. Topics to be covered include SQL, database design, normalization, optimization and ER modeling, as well as database interoperability. This will lead naturally to a discussion of data federation within and across institutions, including national efforts led by the NLM and NCI. A final project is required.
Intended Audience
The intended audience is students enrolled in the MHS in Bioinformatics program, as well as students who need to process, exchange or develop large, complex datasets that represent knowledge in a biomedical domain.
Course Resources
People
Prerequisites
140.636 or permission of instructor.
Students must be comfortable working in Unix or Linux and have previous programming experience in Perl.
Homework and Grading Policy
Grades are based on four programming assignments and a final project. The programming assignments count for 70% of the grade and the final project counts for 30%. Each student is expected to coordinate with the instructor to select a suitable project based on the student's interests. The project should be chosen no later than midway through the course.
For programming assignments, students may discuss ideas and approaches with others. However, programs and projects are to be completed independently and should be original work. Programming assignments will generally be web-based and viewable by the instructors and anyone behind the JHSPH firewall.
Texts
Schedule and Syllabus