Home Tags Contact

Python and Microsoft Access Files

— Don Parakin — python

Here’s how I used Python to read and update data in a Microsoft Access file.

Recently I’ve been doing some pro bono work for a non-profit. The plan is a multi-step roadmap to modernize their ancient IT systems. For the next step, some of the new stuff has to co-exist with some of the old stuff. The old stuff includes a few old Microsoft Access databases. One of these was created around 1997 and so is quite old.

If you are using newer Microsoft Access files, keep reading. The techniques are almost identical (although I have not tried them on newer Access database files).

Needs

Around 2007, this non-profit needed to share some of its data with a 3rd party. Back then they decided to transfer this data in a MS-Access “.mde” file. Fast forward to 2020, a new app would now be the source of several fields of the data.

Eventually we would work with the 3rd party to migrate from .mde to a modern file format. But we couldn’t do that in the time we had available for the current step of the roadmap. So, for the short term, we would need to somehow update the ancient “.mde” file with data from the new app.

Before continuing, a few words about MS-Access versions and file extensions. “.mdb” was the file extension up to MS-Access 2003. “.accdb”, a completely new file format, was introduced by MS-Access 2007 (which still supported the older “.mdb” format). The old “.mdb” format evolved over its lifetime including a jump from 97 to 2000 that was not backward compatible. Fortunately for me, my circa 2007 file is post-2000 otherwise I might have been out of luck.

What is a “.mde” (or “.accde” or “.ade”) file, ending in “e”? It is a compiled instance of a “.mdb” (or “.accdb”) file. It prevents viewing or modifying the code, forms, and reports contained within. My old “.mde” file was not going to let me see its code. Fortunately for me, I didn’t need to.

Candidates

For the modernized systems, we want to use Python wherever possible. It is a very popular language, known by many, easy to learn by others, and usable in many domains (scripts, web dev, desktop tools). So, the search was limited to accessing MS-Access from Python.

For accessing the “.mde” database, something ODBC based was a likely solution. ODBC is a technology developed by Microsoft for accessing data stores. MS-Access is a data store from Microsoft. Soooo ODBC was a likely candidate.

A quick search found the pyodbc module, its documentation, and a few useful hits: link, link.

pypyodbc is another option. It’s pure Python (no compiled C code) and has a few features pyodbc doesn’t have. pyodbc is older and is thanked by the author of pypyodbc who said he used it for years. pypyodbc’s documentation was not quite as good so I decided to go with pyodbc (without trying pypyodbc). Try pypyodbc if pyodbc just ain’t working out for you. I selected pyodbc for its maturity over features I didn’t need.

Solution

I will assume you know a little Python, how to install a Python package (or will read the docs to learn), and that you need to import pyodbc for the code snippets below to work. I won’t pad this blog post with all that.

Get Connected

First, we need to connect to our MS-Access database. To connect, you’ll need to select the appropriate ODBC driver available on your Windows box. Windows comes with many ODBC drivers pre-installed. You’ll have a different set of drivers when running in Windows 32-bit and 64-bit. You’ll also (probably) have drivers for several database types, not just MS-Access.

pyodbc can display the available ODBC drivers. When running with 32-bit Python, pyodbc can find and show the 32-bit ODBC drivers. When running with 64-bit Python, pyodbc can find and show the 64-bit drivers.

In my case, I needed the 32-bit Microsoft Access ODBC driver to access my ancient 32-bit MS-Access file. The 32-bit ODBC drivers require that I run a 32-bit version of Python. Your case may be different, of course.

To see a list of available drivers, run this Python code:

msa_drivers = [x for x in pyodbc.drivers() if 'ACCESS' in x.upper()]
print(f'MS-Access drivers: \n{msa_drivers}')

From the list, I found the driver I needed. I hard-coded this (I know, me bad) in the function I used to get a connection to my MS-Access database. Here’s the function (which will be called in all the examples below):

def get_dbconn(file, password=None):
    pyodbc.pooling = False
    driver = '{Microsoft Access Driver (*.mdb)}'
    dbdsn = f'Driver={driver};Dbq={file};'
    if password:
        dbdsn += f'Pwd={password};'
    return pyodbc.connect(dbdsn)

pydoc provides some useful tips and tricks for specific database platform.

Okay, now that we can connect, let’s do some interesting stuff

Get Table Names

Of course, there was no documentation for the ancient MS-Access database. I needed to find out what was in it. I wasn’t willing to pay for a license for MS-Access just to reverse engine the schema so I needed something else. Fortunately, pyodbc is able to provide this info, starting with a list of the names of all the tables.

dbconn = get_dbconn('mydata.mde', 'a-secret-pw')
cursor = dbconn.cursor()
for table_info in cursor.tables(tableType='TABLE'):
    print(f'Table: {table_info.table_name}')

Get Column Names

Now that I’ve got the name of the table, I need to see what columns are in the table:

tablename = 'Membership Information'
print(f'---- Table: {tablename}')
cursor = dbconn.cursor()
cursor.execute(f'select * from "{tablename}"')
for col in cursor.description:
    print(f'\tcol: {col}')

This gave me the column definition details I needed including name, data type, precision (such as string lengths), etc.

col: ('NUMBER', <class 'float'>, None, 53, 53, 0, True)
col: ('EXPIRY', <class 'datetime.datetime'>, None, 19, 19, 0, True)
col: ('FIRST NAME', <class 'str'>, None, 50, 50, 0, True)
col: ('LAST NAME', <class 'str'>, None, 50, 50, 0, True)
col: ('RATING', <class 'int'>, None, 5, 5, 0, True)
        ...

Read Data

Here’s a simplified example of reading data from the MS-Access database:

tablename = 'Membership Information'
sql = f'select * from "{tablename}" where number > 100000'
cursor = dbconn.cursor()
cursor.execute(sql)
for row in cursor.fetchall():
    # Note: field names are case sensitive
    do_somethine_with(row.NUMBER, row.Email)

Update Data

Here’s a simplified example of updating data in the MS-Access database. Of course, don’t put data in your SQL string; use ? placeholders to prevent SQL injections.

tablename = 'Membership Information'
sql = f'update "{tablename}" set "FIRST NAME" = ?, "Email" = ? where "NUMBER" = 106488'
sql_data = ['Joe', '[email protected]']
cursor.execute(sql, *sql_data)
dbconn.commit()