Convince me of the advantages of L4 migrations

replied 12 years ago

There's no reason for someone to quickly be able to "understand what's going on" by glancing at your migrations any more than for someone to quickly understand what's going on by glancing at all of the commits in your git repository. It's not a workflow that focuses on a single state of the database. But, rather it is about continuously keeping it up to date. Having small 'commits' to your schema changes allows you to have this. It's easy to understand each individual change in its own context. If you want to know exactly what your schema looks like when you're done running migrations, just run them. I personally have no use for some completely up to date sql file that creates my DB schema.

Yes, you're talking about thousands of migrations over a couple of years. And again, I'm not sure where the desire comes from to be able to look at your migrations and know what's going to come out the other side. That's not what migrations are for. It's so that your coworker can pull down the repo and run the script and only have the specific changes be made to their system. Or, to run the migrations against production. This is important, especially if you're utilizing a branching workflow. You can switch feature branches and handle different forking db schemas for working in the branches, then when they're merged, it's all put together. The atomic nature of the migrations is the primary feature.

If you don't use the rollback feature then you're not able to work with different schema versions. You're also not using the migration workflow. I think that you really need to try to use the workflow correctly for a while. Your idea of migrations is that they set up the database for you. But, that's not really the core purpose. You can set up your database in Navicat or Sequel Pro or whatever. You can use Navicat to sync your dev db structure to production if you want, etc. The purpose of migrations is to have the atomic changes and to be able to flexibly migrate your database schema. You can bring your db schema up to a certain point (for example, for a feature branch) then bring it back to where it was for any other purpose. (working in a different branch)

Your point about the quick database refresh assumes that you don't want to be working with persistent data, etc. It's actually very limiting if you think about it.

I've seen people use migrations as just a way to create a database schema. Like, you want to make a change to your schema and then you go edit your migration files, destroy your database, and rerun. I'm sure that you can see how this doesn't allow your team reasonable control over schema changes. When you want to bring your database up to date with someone else's work, destroying all of your data is a very poor option. Having small atomic alterations made to your dataset is much more accurately what someone is looking for in that situation.

If you're going to keep going back and editing your migration files, why bother? Why not just store a schema.sql file in your repository and keep updating it. You can have all of the 'drop table' syntax in there so that all you have to do is run it, it'll wipe all of your data, and set up the schema.

Then, I guess in production you better have a nicely maintained list of changes you've made to the database before you deploy new code that's been in production for a month by a team of 4 developers. Because, if you miss something you're going to have some problems. The atomic nature of the alter statements that migrations generate allows you actually test the database changes from a script, not from a faulty human.

The workflow needs to be like this.

generate a new migration for all database schema changes. if you haven't yet pushed a new migration file to remote (where other developers can pull it down) you can continue to rollback that change and alter the migration file. But, once you push it to remote then further changes will need to be in a new migration file. Otherwise, you're causing havoc and slowing down other developers.
after creating the migration, run it forward (php artisan migrate) and test that your dev db is altered in the way that you intend, then roll it back (php artisan migrate:rollback) to test that it rolled back correctly. Then, run it forward one last time nad continue your work. This way your migrations are again, not causing problems for other developers.
when you pull down the code of others, you can run a migrate forward, db:seed, and a composer install (i have a script that does this) then you are up to date and good to go. You didn't have to think about the schema and you didn't have to destroy all of your development database.

To me it seems like you're not working in a team where these workflows are followed, so it's hard for you to see the benefit. Maybe you work as the only developer on a project, I don't know. But, a migration system as you describe (as i understand it) would be useless to me as I primarily work in teams.

Last updated 4 years ago.

0

codenamegary

replied 12 years ago

Doctrine handles schema changes quite a bit differently, more inline with what you're describing as your more preferred workflow, you should take a look.

As for Migrations, I agree with ShawnMcCool. I find them useful and the key is the atomic nature of things. Previous to migrations (many moons ago) I'd use MySQL workbench to sync schemas with models and things, that was a nightmare and difficult to share between a team, migrations work way better IMO.

I don't think migrations are perfect, but they do make working in a team quite easy. Like I said, Doctrine has a different approach which is more similar to what you're after but the learning curve is a lot steeper, I've only used it on a couple of projects but I really did like it.

Last updated 4 years ago.

0

jakobud

replied 12 years ago

Thanks for the response Shawn. A couple points/questions:

There's no reason for someone to quickly be able to "understand what's going on" by glancing at your migrations

If anyone ever asked me a troubleshooting question about their project, and I wanted to look at the database structure I could easily look at their Schema Builder code and see it. But not if their table changes are spread out over dozens of files.

Having small 'commits' to your schema changes allows you to have this

Isn't that the same thing as using source control for your schema changes?

... if you're utilizing a branching workflow. You can switch feature branches and handle different forking db schemas for working in the branches, then when they're merged, it's all put together.

But my source control is going to merge together Schema Builder changes too.

Also, if you have multiple developers each working on their own branch at the same time, if they both add several migration files around the same time, since those files are timestamped and that determines their execution order, what are the chances that when you bring it all back together and run your migration, that something in your tables gets screwed up because the order of executed migrations is now different? You might have originally created your migration files A, B and C, that execute in that order. But if your coworker created migration F around the same time, the resulting order might end up being A B F C.

Taking that a step further, what if in migration F, your coworker changed or deleted some table column that you were depending on in your migration C? Your migration would now fail and would take some time for you to figure out what went wrong, since it worked fine on your branch. If you were simply editing a common Schema Builder file, the overlapping changes would be more obvious in the merge. When you migration failed that this point it would be easy to open up the file or look at your git history and see "Oh the column was removed from line 22 on that merge". Troubleshooting a dozen migration files that were merged together in a mixed order would take some time, to say the least....

Your point about the quick database refresh assumes that you don't want to be working with persistent data

Can you clarify what you mean? When you are talking about persistent data, do you mean that people out there develop sites using non-test data that is important and that they need to be persistent?? I guess that's a dangerous and foreign concept to me that I wouldn't want to touch with a 10 foot pole. Any data that I work with on a development site is temporarily and should be able to be instantly recreated using database seeding (save for the occasional form submission for testing).

edit your migration files, destroy your database, and rerun. I'm sure that you can see how this doesn't allow your team reasonable control over schema changes

Not if you are using source control correctly...... Or am I missing something?

Why not just store a schema.sql file in your repository and keep updating it.

I've come to love the Schema Builder usage. Quicker. Easier to read. Harder to make mistakes.

Then, I guess in production you better have a nicely maintained list of changes you've made to the database before you deploy new code

Again, isn't source control an accurate history of changes you make to your files? In this case, it would be a file with Schema Builder code that has evolved over time?

 The atomic nature of the alter statements that migrations generate allows you actually test the database changes from a script, not from a faulty human.

This went over my head. Not sure what you mean here. Are you talking about unit testing? If my tables were created using Schema Builder code, wouldn't it allow the same thing? Confused a bit on this.

Maybe in the end, it really is just my workflow that is keeping me from seeing the advantages.... I'm really not sure at this point.

Last updated 4 years ago.

0

shawnmccool

replied 12 years ago

I do think that it's the workflow.

Last updated 4 years ago.

0

jakobud

replied 12 years ago

Shawn, when you start a new site, and you are in the process of designing and creating your database tables, what is your workflow?

Last updated 4 years ago.

0

pixelpeter

replied 12 years ago

Well I'm not shawn, but here's my workflow

When I start with a new project and I don't know jet exactly what my tables will look like, I will do a complete rollback, change the migration, modify the seed (if necessary) and migrate again with

artisan migrate --seed

So I don't get lost in to many migration files because I decided to change e.g. a varchar from 255 to 150.

After I have a stable (and commited) version of the app I will use additional migrations to make the slight changes (like adding a new column, setting an index).

That's so much easier to manage between team members and multiple systems (where the deployment is done by somebody else).

The only additional work to do is to generate the sql statements for a change in a live deployment where we use the percona tools to make the change on a 16 million row table because we don't want to lock the table.

I'm pretty happy eith the migrations because we run into too many problems because of different schema versions on different development machines and live system.

Last updated 4 years ago.

0

shawnmccool

replied 12 years ago

I create the migrations and continue to update them up to the point where I push them to a remote. Then, if I need to make a change, I create a new migration.

I create one migration per table, never more. if you have an operation fail in a migration, then the whole migration fails. If your migration fails halfway through, then you can never quick roll back correctly because some modifications have been made that the system isn't aware of.

Last updated 4 years ago.

0

valorin

replied 12 years ago

Jakobud, it sounds to me like your workflow involves developing the site and the SQL *before it goes live, and then once it's live you don't change your SQL?

You can use the migrations for this - simply edit the initial create table migration files as many times as you need until your site goes live. It's the same thing, only the information is stored in a different format.

Once it goes live and in production, well, I don't think you described how you update the SQL Schema in a production system? But if you were using migrations, then updating the SQL Schema involves making a new migration, and applying it. Very easy for anyone to apply, and very easy to test.

So the real benefit for the migrations for me is that you can apply updates to a production database by running a script, rather than manually editing the SQL. (Which I've done in a big legacy product... it's painful!)

Does that make sense?

Last updated 4 years ago.

0

garbee

replied 12 years ago

The rollback feature is incredibly useful in development. For example, you make a table with all the columns you think you will need. Then as you are adding something else, you realize you need a new foreign key. So instead of having to reset the DB completely, you can just rollback that table creation, edit the migration for that table, then re-migrate & seed. That way you keep the minor migrations down for things that can be easily handled in dev with a rollback.

In my opinion, you should only create a migration for modifying things after they are already running in production in another way. Before that, you can always rollback the schema and fix it directly.

The rollback feature may also help you with the teammate creating migrations and timestamp problems. When they make a commit to the main dev branch, you should rollback your branches mods and rebase. Then remigrate and see if any conflicts arise, if so, fix it.

Last updated 4 years ago.

0

jakobud

replied 12 years ago

Thanks for the insight guys.

Okay so it sounds like it's common practice for a lot of people to actually go back and make changes to migration files. I wasn't expecting to hear that. I was under the assumption that any change you need to make to the db, no matter how small, needs to be done in a new migration. Like if I was deep into a project, lets say there are 3 dozen migration files. Going back and making a change to the first migration file seems like that would introduce problems. At this point you are required to do a full db refresh to get the latest changes....

Do people typically run migrations on production servers? How does this workflow go?

For example, lets say you have website version 9 deployed on the production server. You just finished developing and testing website version 10 on your dev server and now it's time to deploy to the live server. Version 10 has changes in the code and some new migration files in order to add or remove some columns from some tables.

If you simply deploy your new site to the production server, before you have a chance to run your migrations, the site is going to be broken in places because it's looking for database columns that aren't there yet.

If you deploy the site to a staging environment that connects with the production database and then run the migrations, your current live site (version 9) might break due to database changes that it's code can't handle. That is, until you actual deploy version 10 code to the production environment.

So, what exactly is the proper workflow for a situation where your app code changes and your database needs altered via migrations when you update a live production site with production data in the db?

Last updated 4 years ago.

0

codenamegary

replied 12 years ago

Jakobud said:

Thanks for the insight guys.

Okay so it sounds like it's common practice for a lot of people to actually go back and make changes to migration files.

I don't think anybody said that in the context you're taking it. It's common to edit a migration a bunch of times before you push it to a remote repository, but once it's there (if you're working with other devs) and certainly once it's deployed to production, you don't change it. You create new migrations.

Do people typically run migrations on production servers? How does this workflow go?

You have a deploy script that runs migrations before making your new code "live".

For example, lets say you have website version 9 deployed on the production server. You just finished developing and testing website version 10 on your dev server and now it's time to deploy to the live server. Version 10 has changes in the code and some new migration files in order to add or remove some columns from some tables.

If you simply deploy your new site to the production server, before you have a chance to run your migrations, the site is going to be broken in places because it's looking for database columns that aren't there yet.

If your schema changes break the existing site, your site should go into maintenance mode while you deploy the changes. You could, in theory, write something into your deploy script that does this automatically, puts the site into maintenance mode while the schema changes are being applied, and until the new version of your code goes live. This is going to happen regardless of what method (migrations, manually applying changes or otherwise) that you use to update the schema.

If you deploy the site to a staging environment that connects with the production database and then run the migrations, your current live site (version 9) might break due to database changes that it's code can't handle. That is, until you actual deploy version 10 code to the production environment.

Only production should ever, EVER touch your production DB. If something else is touching production, that's bad. In an ideal world, you should have some type of testing / dev environment whereby you test out everything before you push to production so that you know it works.

So, what exactly is the proper workflow for a situation where your app code changes and your database needs altered via migrations when you update a live production site with production data in the db?

The same as anything else, regardless of how you're altering the database. You throw it into maintenance mode (if required), apply your changes, and then turn things back on.

Last updated 4 years ago.

0

jakobud

replied 12 years ago

Thanks for the input guys. I'm warming up a bit more back towards migrations. Thanks.

Last updated 4 years ago.

0

garbee

replied 12 years ago

Yea, a staging site that touches the production DB. An absolute sign of bad design. The staging server should reflect the production one, but not share the same data store. I wouldn't even copy live data out of production for anything except backups. What happens when you test a change in staging with users and accidentally send an email out to all the users? Whoops, you just sent an email to what could be a few hundred to a couple thousand people that they never asked for.

With your software "versions" thinking, migrate what you need in the version dev. Then if there is any need to change it, rollback and edit directly. Once that version goes live, consider those migrations locked and then start a new set for the next version. And using maintenance mode for applying the changes is dead on. Depending on how long it takes though you could just git pull and then migrate right away and no one would notice the difference on the live site.

Last updated 4 years ago.

0

Moderators