Skip to content

FoundCompBio-Spr24/IntroToPython_Week5

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 

Repository files navigation

Introduction to Python

Reading

Computing Skills for Biologists, Chapter 3 (Basic Programming), pgs. 81-119

What is Python?

  • Interpreted, high-level language
  • Very popular among computational biologists
  • One of the most "human readable" programming languages
  • One of the few languages where line indentation is actually interpreted as part of the code
  • Python is much less picky than bash about spacing within lines, though
  • Dynamically typed (don't worry, we'll get to this later)

Three Ways to Run Python

Python Interpreter

  • Unlike many programming languages (e.g., C++ and Java), Python can be executed interactively.
  • To start the Python interpreter, simply open up a Terminal window and type python.
  • Once the Python prompt (>>>) appears, type:
    • a = 3
    • print(a)
  • What do you see?
  • To exit the Python interpreter, type quit().

Python Scripts

  • Now, put those same commands in a text file ending with a .py extension (e.g., test.py).
  • You can use this line at the top of the file to indicate that this is a Python script:
    • #! /bin/python3
    • To run this, add execute permissions to the file, then use ./test.py
  • Alternatively, you can type python3 test.py without the header line.
  • What do you see?
  • The .py extension isn't required, but is conventional for Python code.

Jupyter Notebooks

  • Jupyter notebooks are a special type of file that allows formatted notes to be mixed with Python code.
  • These notebooks can be run on smic through OnDemand.
  • To open a Jupyter notebook, go to the "Interactive Apps" menu at the top and select "Jupyter Notebook".

Basic Python Data Types

  • Numbers

    • Note that, unlike bash, Python can handle both integers and floating point (decimal) numbers.
    • Integers - myInt = 10
    • Floating point numbers (decimals) - myFloat = 3.21312
    • If running Python interactively, variable values can be seen just by typing the name of the variable - myFloat.
    • If running Python in a script, variable values can be printed by using the print() funtion - print(myFloat).
    • Changing numbers from one type to another is one form of "type casting".
  • Strings - myStr = "This is a string."

    • Strings can be of any length, but are always defined with quotes.
    • Individual characters, or subsets of characters, can be accessed using square brackets - []
      • String indices (and all indices in Python) start at 0
      • myStr = "biology"
      • myStr[0]
      • A range of indices can be defined with a colon
      • myStr[2:5]
    • Strings can be concatenated together with the + operator.
      • myStr = "biology"
      • newString = myStr + "_is_super_interesting"
      • newString
    • It is very easy in Python to use for loops with strings. Try this:
for letter in myStr:
    print(letter)
  • Booleans - True or False
    • These variables can take the values True or False only.
      • myBool = True
    • These are the data types used in things like if...else statements and while loops.
    • Logical statements, like myNum > 3 or myStr == "test", return a boolean variable indicating the result of the comparison
    • In Python, the value of a Boolean can be reversed by preceding it with not.

Functions

  • Functions are the verbs of a programming language
  • Functions take some variables or objects as input (arguments) and either do something with them or return a new variable/object.
  • Functions are always written like this - myFunction(object1,object2,...)
  • They always have parentheses, even if they don't need any arguments to be passed to them
    • For example, myFunction()
  • The number and type of arguments needed is specific to each function
  • Perhaps the simplest function to use is print(). print() can take an arbitrary number of variables as arguments and will print them to the screen.
  • Here are a few other really useful functions and what they do:
    • len(string) - Takes one argument (a string) and returns its length
    • abs(number) - Takes one argument (either an integer or a float) and returns its absolute value
    • round(float) - Takes one floating point number as an argument and rounds it to the nearest integer
    • type(variable) - Takes one variable as an argument and returns its type

Methods

  • Methods are functions (sort of like actions or verbs) that belong to a variable
  • Methods are accessed and used by appending them to the name of a variable, separated by a .
  • For instance, let's say we define a string - myStr = "biOLogy ". Here are some really useful methods that can be used with that string:
    • myStr.upper() - Converts all characters in the string to uppercase
    • myStr.lower() - Converts all characters in the string to lowercase
    • myStr.strip() - Removes whitespace from the beginning or ending of the string
    • Methods can also be chained! - myStr.strip().lower()
    • myStr.replace("y","ical") - This method functions like find and replace, where the first argument is the text to find and the second argument is the text to use when replacing.
    • myStr.split("O") - Breaks up the string, using the character provided as an argument as the delimiter. This method then returns a list of new strings. We'll talk more about lists later.

Comments in Python

  • Just like with bash, comments are essential in Python.
  • Comments are included in Python by putting a # at the beginning of a line.
  • Many text editors have a shortcut keystroke to toggle lines between being commented and uncommented.
  • # This is an example of a comment

String Formatting

  • Python has a special way to format strings so that the values of variables can be included
  • The operator % is used to indicate that the values of certain variables should be used to fill in a string
  • The parts of the string where the values should go is indicated with placeholders (kinda like wildcards)
  • The three most basic placeholders are:
    • %d - An integer
    • %f - A floating point number (i.e., a decimal)
    • %s - A string
  • Try this:
    • faveInt = 7
    • faveFloat = 3.14
    • faveString = phyleaux
    • print("Favorite integer is %d, favorite float is %f, and favorite string is %s." % (faveInt,faveFloat,faveString))
  • Note how the variables holding the values are provided in the same order inside parentheses after the %.

User Input

  • Python has a special function, input("Message:" ), to accept input from a user.
  • This function reads the user input and returns it. But be sure to store it in a variable!
    • userStr = input("Input some string: ")
  • You can use the string methods outlined above to standardize user input - like stripping out excess whitespace, changing character case, etc.

Getting help from Python

  • If you need help remembering what methods are available for a given object, use dir(<OBJECT>)
  • Note that methods starting and ending with "__" are not meant for users to call. They are methods to be used by the system.
  • Try creating a string variable (e.g., myStr = "some text").
    • Now run dir(myStr)
    • Pick a method you haven't used before and see what it does with myStr.

if statements

  • These are similar in form to bash, but simpler.
  • The general form is:
if <CONDITION>:
    <DO_THIS>
  • The if statement has to finish with a :.
  • Also, in Python, the code block inside the if statement must be indented. You can use either tabs or a certain number of spaces. When using spaces, you must use the same number of spaces throughout your code. The official recommendations for Python style strongly suggest using 4 spaces, rather than tabs.
  • Note that the syntax for conditions is pretty intuitive in Python
    • myStr != "other text"
  • if statements can be used on their own, but no code will be executed when the condition is false.

if...else statements

  • To include code blocks that should be executed when the condition is false, you'll need to use an if...else statement. These have the general form:
if <CONDITION>:
    <DO_THING_WHEN_TRUE>
else:
    <DO_THING_WHEN_FALSE>
  • Sometimes, you'll want to test a series of conditions to decide what to do. In that case you could use a nested set of if...else statements:
nuc = "A"
if nuc == "A":
    print("Nucleotide is an A.")
else:
    if nuc == "C":
        print("Nucleotide is a C.")
    else:
        if nuc == "G":
            print("Nucleotide is a G.")
        else:
            if nuc == "T":
                print("Nucleotide is a T.")
  • However, you'll notice that this can get pretty cumbersome pretty quickly. Thankfully, python offers another, more compact, way to write this:
nuc = "A"
if nuc == "A":
    print("Nucleotide is an A.")
elif nuc == "C":
    print("Nucleotide is a C.")
elif nuc == "G":
    print("Nucleotide is a G.")
elif nuc == "T"
    print("Nucleotide is a T.")
  • The elif statements stand for else...if and capture what we tried to do above with the nesting.
  • It is important to realize that as we've currently written these statements, nothing would happen if the variable nuc didn't have the value A, C, G, or T. This can be a big problem when accepting input from a user or file that might provide a nonsensical value. To capture these cases, it's always best to close with a simple else:
nuc = "A"
if nuc == "A":
    print("Nucleotide is an A.")
elif nuc == "C":
    print("Nucleotide is a C.")
elif nuc == "G":
    print("Nucleotide is a G.")
elif nuc == "T":
    print("Nucleotide is a T.")
else:
    print("This is not a valid nucleotide!")
In-Class Practice Exercise (Characterizing Sequences)

(1) Copy the sequences available in this repository.

(2) Write a Python script with code to answer these questions.
For each question, include comments describing how you're
addressing the question and have the script print out informative
messages with answers to each question.

- How long are the sequences?
- Are the sequences the same length?
- Convert the DNA sequences to equivalent RNA sequences (T -> U).
- What are the frequencies of the four nucleotides in each sequence?
- At how many positions do the sequences differ?
- Generate the reverse complement of each sequence.

Additional Python Resources

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published