{ "metadata": {}, "nbformat": 4, "nbformat_minor": 5, "cells": [ { "id": "metadata", "cell_type": "markdown", "source": "
\n\n# Python - Basic Types & Type Conversion\n\nby [The Carpentries](https://training.galaxyproject.org/hall-of-fame/carpentries/), [Helena Rasche](https://training.galaxyproject.org/hall-of-fame/hexylena/), [Donny Vrins](https://training.galaxyproject.org/hall-of-fame/dirowa/), [Bazante Sanders](https://training.galaxyproject.org/hall-of-fame/bazante1/)\n\nCC-BY licensed content from the [Galaxy Training Network](https://training.galaxyproject.org/)\n\n**Questions**\n\n- What kinds of data do programs store?\n- How can I convert one type to another?\n\n**Objectives**\n\n- Explain key differences between integers and floating point numbers.\n- Explain key differences between numbers and character strings.\n- Use built-in functions to convert between integers, floating point numbers, and strings.\n\n**Time Estimation: 30M**\n
\n", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-0", "source": "

Python is a typed language: every piece of data has a type, and values of different types cannot always be combined directly. They may need a conversion step before they can be used together. For instance, if you add a number to a number, what should happen? If you add a number to a message, what do you expect will happen?

\n
\n
Agenda
\n

In this tutorial, we will cover:

\n
    \n
  1. Types
\n
\n
\n

# Types

\n

Every value in a program has a specific type.

| Name | Python Code | Represents |
|---|---|---|
| Integer | `int` | positive or negative whole numbers like 3 or -512 |
| Floating point number | `float` | real numbers like 3.14159 or -2.5 |
| Character string | `str` | text, written with either ' or \" quotes (they must match) |
\n

## Checking the Type

\n

Use the built-in function `type` to find out what type a value has. This works on values as well as variables. But remember: the value has the type; the variable is just a label.

\n

Check the type of values with the type() function:

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-1", "source": [ "print(type(52))\n", "print(type(3.14159))" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-2", "source": "

You can also check the types of variables:

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-3", "source": [ "fitness = 'average'\n", "print(type(fitness))" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-4", "source": "

## Methods

\n

A value’s type determines what the program can do to it. Some operations may work:

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-5", "source": [ "print(5 - 3)" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-6", "source": "

And some operations may not work:

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-7", "source": [ "print('hello' - 'h')" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-8", "source": "

For instance, you can use the + and * operators on strings.

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-9", "source": [ "full_name = 'Ahmed' + ' ' + 'Walsh'\n", "print(full_name)\n", "separator = '=' * 10\n", "print(separator)" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-10", "source": "

Some methods only accept specific types, or only work on specific types.

\n

The built-in function len returns the length of your data. Which of the following would you expect to work? len(string)? len(int)?

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-11", "source": [ "print(len(full_name))\n", "print(len(52))" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-12", "source": "

## Matching Types

\n

Not all types support all operations. Adding an integer to a string doesn’t make much sense:

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-13", "source": [ "print(1 + '2')" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-14", "source": "

This does not work because it’s ambiguous: should 1 + '2' be 3 (a number) or '12' (a string)? Some types can be converted to other types by using the type name as a function.

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-15", "source": [ "print(1 + int('2'))\n", "print(str(1) + '2')" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-16", "source": "

## Operation Support

\n

Here is a quick chart showing which operations are allowed for each pair:

| Left\\Right | int | float | str |
|---|---|---|---|
| int | `+` `-` `*` `/` | `+` `-` `*` `/` | `*` |
| float | `+` `-` `*` `/` | `+` `-` `*` `/` | |
| str | `*` | | `+` |
\n

As you can see, you can do 3 * \"test\" and \"test\" * 3, but multiplying a string by a float does not work.
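
As a small sketch of the str row and column of the chart (the variable names are illustrative and not part of the original lesson), repeating a string with an integer works in either order, while a float is rejected with a TypeError:

```python
word = 'test'

# Repetition works with the integer on either side.
left = 3 * word
right = word * 3
print(left, right)

# A float is rejected, even when it holds a whole number.
try:
    word * 3.0
    float_worked = True
except TypeError as error:
    float_worked = False
    print(f'TypeError: {error}')
```

Wrapping the failing line in try/except keeps the cell running, so both behaviours can be compared in one go.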

\n

You can mix integers and floats freely in operations.

\n

Integers and floating-point numbers can be mixed in arithmetic. Python 3 (which we use) automatically converts integers to floats as needed.

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-17", "source": [ "print(f'half is {1 / 2.0}')\n", "print(f'three squared is {3.0 ** 2}')" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-18", "source": "

Variables only change value when something is assigned to them.

\n

If we make one cell in a spreadsheet depend on another, and update the latter,\nthe former updates automatically. However, this does not happen in programming languages.

\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-19", "source": [ "variable_one = 1\n", "variable_two = 5 * variable_one\n", "variable_one = 2\n", "print(f'first is {variable_one} and second is {variable_two}')" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-20", "source": "

The computer reads the value of variable_one when doing the multiplication, creates\na new value, and assigns it to variable_two. After that, variable_two does not remember\nwhere it came from. Every computation happens line-by-line.
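
If the dependent value should follow the change, it has to be recomputed explicitly. A minimal sketch, reusing the variable names from the cell above (the name stale is an illustrative assumption):

```python
variable_one = 1
variable_two = 5 * variable_one   # variable_two is now 5

variable_one = 2                  # variable_two is untouched by this line
stale = variable_two

variable_two = 5 * variable_one   # recompute explicitly to pick up the change
print(f'stale value: {stale}, recomputed value: {variable_two}')
```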

\n
\n
Question: Fractions
\n

What type of value is 3.14159?\nHow can you find out?

\n
👁 View solution\n
\n

It is a floating-point number (often abbreviated “float”).\nIt is possible to find out by using the built-in function type().

\n
print(type(3.14159))\n<class 'float'>\n
\n
\n
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-21", "source": [ "# Test out solutions here!\n", "" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-22", "source": "
\n
Question: Automatic Type Conversion
\n

What type of value is the result of (3.25 + 4)?

\n
👁 View solution\n
\n

It is a float:\nintegers are automatically converted to floats as necessary.

\n
result = 3.25 + 4\nprint(f'{result} is {type(result)}')\n
\n
7.25 is <class 'float'>\n
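
To see the same rule on a few more expressions, here is a small sketch (the variable names are illustrative assumptions, not from the lesson). It compares an all-integer sum, a mixed sum, and a division:

```python
int_sum = 3 + 4          # int + int stays an int
mixed_sum = 3.25 + 4     # the int is converted to a float first
quotient = 4 / 2         # / produces a float in Python 3, even for whole results

for value in (int_sum, mixed_sum, quotient):
    print(f'{value} is {type(value)}')
```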
\n
\n
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-23", "source": [ "# Test out solutions here!\n", "" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-24", "source": "
\n
Question: Choose a Type
\n

What type of value (integer, floating point number, or character string)\nwould you use to represent each of the following? Try to come up with more than one good answer for each problem. For example, in #1, when would counting days with a floating point variable make more sense than using an integer?

\n
    \n
  1. Number of days since the start of the year.
  2. Time elapsed from the start of the year until now in days.
  3. Serial number of a piece of lab equipment.
  4. A lab specimen’s age.
  5. Current population of a city.
  6. Average population of a city over time.
\n
👁 View solution\n
\n

The answers to the questions are:

\n
    \n
  1. Integer, since the number of days would lie between 1 and 365 (366 in a leap year).
  2. Floating point, since fractional days are required.
  3. Character string if the serial number contains letters and numbers, otherwise integer if the serial number consists only of numerals.
  4. This will vary! How do you define a specimen’s age? Whole days since collection (integer)? Date and time (string)?
  5. Choose floating point to represent population as large aggregates (e.g. millions), or integer to represent population in units of individuals.
  6. Floating point number, since an average is likely to have a fractional part.
\n
\n
\n
\n
Question: Division Types
\n

In Python 3, the // operator performs integer (whole-number) floor division, the / operator performs floating-point\ndivision, and the % (or modulo) operator calculates and returns the remainder from integer division:

\n
print(f'5 // 3: {5 // 3}')\nprint(f'5 / 3: {5 / 3}')\nprint(f'5 % 3: {5 % 3}')\n
\n
5 // 3: 1\n5 / 3: 1.6666666666666667\n5 % 3: 2\n
\n

If num_subjects is the number of subjects taking part in a study,\nand num_per_survey is the number that can take part in a single survey,\nwrite an expression that calculates the number of surveys needed\nto reach everyone once.

\n
👁 View solution\n
\n

We want the minimum number of surveys that reaches everyone once, which is\nthe value of num_subjects / num_per_survey rounded up. We can get this by\nperforming a floor division with // and adding 1. Before\nthe division we need to subtract 1 from the number of subjects, so that the result is not one too high\nin the case where num_subjects is evenly divisible by num_per_survey.

\n
num_subjects = 600\nnum_per_survey = 42\nnum_surveys = (num_subjects - 1) // num_per_survey + 1\n\nprint(num_subjects, 'subjects,', num_per_survey, 'per survey:', num_surveys)\n
\n
600 subjects, 42 per survey: 15\n
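
One way to convince yourself that the subtract-then-add trick really rounds up is to compare it with math.ceil from the standard library. The sketch below wraps the expression in a hypothetical helper, surveys_needed, which is not part of the lesson, and checks a case that divides evenly (588 = 14 * 42) alongside two that do not:

```python
import math

def surveys_needed(num_subjects, num_per_survey):
    # Hypothetical helper: the rounding-up expression from the solution above.
    return (num_subjects - 1) // num_per_survey + 1

for subjects in (600, 588, 1):
    expected = math.ceil(subjects / 42)
    print(subjects, 'subjects ->', surveys_needed(subjects, 42), 'surveys')
    assert surveys_needed(subjects, 42) == expected
```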
\n
\n\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-25", "source": [ "# Test out solutions here!" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-26", "source": "
\n
Question: Strings to Numbers
\n

Where reasonable, float() will convert a string to a floating point number,\nand int() will convert a floating point number to an integer:

\n
print(\"string to float:\", float(\"3.4\"))\nprint(\"float to int:\", int(3.4))\n
\n
string to float: 3.4\nfloat to int: 3\n
\n

If the conversion doesn’t make sense, however, Python raises an error.

\n
\n
Code In: Python
\n
print(\"string to float:\", float(\"Hello world!\"))\n
\n
\n
\n
Code Out
\n
Traceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nValueError: could not convert string to float: 'Hello world!'\n
\n
\n

Given this information, what do you expect the following program to do?

\n

What does it actually do?

\n

Why do you think it does that?

\n
print(\"fractional string to int:\", int(\"3.4\"))\n
\n
👁 View solution\n
\n

What do you expect this program to do? It would not be so unreasonable to expect the Python 3 int() function to\nconvert the string “3.4” to the float 3.4 and then, with an additional type conversion, to the integer 3. After all, Python 3 performs a lot of other\nmagic - isn’t that part of its charm?

\n
int(\"3.4\")\n
\n

However, Python 3 raises an error.

\n
Traceback (most recent call last):\n  File \"<stdin>\", line 1, in <module>\nValueError: invalid literal for int() with base 10: '3.4'\n
\n

Why? To be consistent, possibly. If you want Python to perform two consecutive\ntype conversions, you must write both of them explicitly in code.

\n
int(float(\"3.4\"))\n
\n
3\n
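
If this two-step conversion is needed in several places, it can be wrapped in a small function. The sketch below uses a hypothetical helper name, to_int, that is not part of the lesson. Note that int() truncates toward zero rather than rounding:

```python
def to_int(text):
    # Hypothetical helper: go through float so that a string like '3.4' is accepted.
    return int(float(text))

print(to_int('3.4'))    # truncates toward zero
print(to_int('-3.9'))
print(to_int('12'))

# Calling int() directly on the fractional string still fails.
try:
    int('3.4')
    direct_worked = True
except ValueError:
    direct_worked = False
print('int on the raw string worked:', direct_worked)
```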
\n
\n
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-27", "source": [ "# Test out solutions here!" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-28", "source": "
\n
Question: Arithmetic with Different Types
\n

Which of the following will return the floating point number 2.0?\nNote: there may be more than one right answer.

\n
first = 1.0\nsecond = \"1\"\nthird = \"1.1\"\n
\n
    \n
  1. first + float(second)
  2. float(second) + float(third)
  3. first + int(third)
  4. first + int(float(third))
  5. int(first) + int(float(third))
  6. 2.0 * second
\n
👁 View solution\n
\n

Answer: 1 and 4 give exactly 2.0.\nAnswer 5 gives the value 2, which may be considered equivalent but is an integer rather than a float.
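
All six options can also be checked in one cell. The sketch below is one possible way to do it, not the lesson's own code. Each expression is wrapped in a lambda so that the ones that raise an error can be caught individually, and the names options and results are illustrative assumptions:

```python
first = 1.0
second = '1'
third = '1.1'

# Each option is wrapped in a lambda so that errors can be caught one by one.
options = {
    1: lambda: first + float(second),
    2: lambda: float(second) + float(third),
    3: lambda: first + int(third),
    4: lambda: first + int(float(third)),
    5: lambda: int(first) + int(float(third)),
    6: lambda: 2.0 * second,
}

results = {}
for number, option in options.items():
    try:
        results[number] = option()
    except (TypeError, ValueError) as error:
        # Store the name of the error instead of a value.
        results[number] = type(error).__name__
    print(number, repr(results[number]))
```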

\n
\n
\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "id": "cell-29", "source": [ "# Test out solutions here!" ], "cell_type": "code", "execution_count": null, "outputs": [], "metadata": { "attributes": { "classes": [ "> " ], "id": "" } } }, { "id": "cell-30", "source": "\n", "cell_type": "markdown", "metadata": { "editable": false, "collapsed": false } }, { "cell_type": "markdown", "id": "final-ending-cell", "metadata": { "editable": false, "collapsed": false }, "source": [ "# Key Points\n\n", "- Every value has a type.\n", "- Use the built-in function `type` to find the type of a value.\n", "- Types control what operations can be done on values.\n", "- Strings can be added and multiplied.\n", "- Strings have a length (but numbers don't).\n", "- Must convert numbers to strings or vice versa when operating on them.\n", "- Can mix integers and floats freely in operations.\n", "- Variables only change value when something is assigned to them.\n", "\n# Congratulations on successfully completing this tutorial!\n\n", "Please [fill out the feedback on the GTN website](https://training.galaxyproject.org/training-material/topics/data-science/tutorials/python-types/tutorial.html#feedback) and check there for further resources!\n" ] } ] }