Casting populated table column to Enum in Flask with SQLAlchemy

Photo by Chris Lawton on Unsplash

Implementing enum in Python is not that hard. But when you have a populated database and you want to change a type of column to enum, there are multiple things can go wrong. In this post I will go through steps how one can change a column of string type to enum without damaging current state.

Let’s say we have a form in the web page for user. The form is backed with table user_info with a column employment_type. The form should have 3 choices and user should choose one : Employed, Not employed, Entrepreneur.

What we did initially was making a column employment_type as a string, and create the choices in the form separately.

NB! Most of the code is simplified for the sake of example.

class UserInfo():
__tablename__ = ‘user_info’
employment_type = db.Column(db.String())

And the current form looks like this:

class UserForm():    employment_type = SelectField(
'What is your employment type?',
coerce=str,
default='Employed',
choices=(
('Employed', 'Employed'),
('Not employed', 'Not employed'),
('Entrepreneur', 'Entrepreneur'),
),
validators=(
AnyOf(('Employed', 'Not employed', 'Entrepreneur')),
),
)

Now we make a decision to switch the employment_type from string to enum. Why? Because in current state of database, any kind of string can end up in employment_type column, but our business logic is to limit that to 3 choices.

1. First, we can create the enum as a separate class.

class EmploymentChoices():
EMPLOYED = 'Employed'
NOT_EMPLOYED = 'Not employed'
ENTREPRENEUR = 'Entrepreneur'
class UserInfo():
__tablename__ = ‘user_info’
employment_type = db.Column(db.Enum(EmploymentChoices))

There is something wrong here. In SQLAlchemy, only the string names are persisted in database, not values. Let’s check names of the enum,

for member in EmploymentChoices:
print(member.name)
> EMPLOYED
> NOT_EMPLOYED
> ENTREPRENEUR

This means that, our employment_type column will be consisted of only three strings, which are : EMPLOYED, NOT_EMPLOYED and ENTREPRENEUR. But we don’t want that, because our current database is populated with string which correspond to values of the enum members:

for member in EmploymentChoices:
print(member.value)
> Employed
> Not employed
> Entrepreneur

In order to overcome this barrier, we can use additional parameter in the declaration of the employment_type column.

class UserInfo():
__tablename__ = ‘user_info’
employment_type = db.Column(db.Enum(EmploymentChoices, values_callable=lambda x: [str(member.value) for member in EmploymentChoices]))

With the help of Enum.values_callable we can persist the values of Enum members instead of their names.

2. Implement the form.

Now we have to modify our previous form. Instead of using SelectField with manually created choices, we can fetch the choices with QuerySelectField. To handle the query more easily, I added staticmethod to EmploymentChoices:

class EmploymentChoices():
EMPLOYED = 'Employed'
NOT_EMPLOYED = 'Not employed'
ENTREPRENEUR = 'Enterpreneur'
@staticmethod
def fetch_names():
return [c.value for c in EmploymentChoices]

This implementation of field is also called ORM-Backed Fields. More about it

class UserForm():
employment_type = QuerySelectField('What is your employment type?', query_factory=database.EmploymentChoices.fetch_names,
get_pk=lambda a: a,
get_label=lambda a: a)

3. Migration! Alembic? Alembic! 😴

Now, here comes migrations aka the headache. I use alembic to automatically generate migration files, but unfortunately, it doesn’t always catch the changes. Therefore, I had to write the migration file manually:

employment_choices = sa.Enum(
'Employed', 'Not employed', 'Entrepreneur',
name='employment_choices'
)

def upgrade():
employment_choices.create(op.get_bind())
op.execute('ALTER TABLE user_info ALTER COLUMN employment_choices DROP DEFAULT;')
op.execute('ALTER TABLE user_info ALTER COLUMN employment_type TYPE employment_choices USING employment_type::employment_choices;')
with op.batch_alter_table('user_info', schema=None) as batch_op:
batch_op.alter_column('employment_choices',
server_default='Employed',
nullable=True)

def downgrade():
op.execute('ALTER TABLE user_info ALTER COLUMN employment_type TYPE VARCHAR;')
op.execute("DROP TYPE employment_choices")

At the end of day, implementing enums on populated table is not that hard, but still requires some manual labor.

I talk to myself all the time.