I remember my first massive spreadsheet project. My office was tasked with tracking the department’s annual budget, so I built an impressive network of spreadsheets. It had multiple linked spreadsheets, each with several tabs, auto-calculations, stylish charts, PivotTables, and lookup formulas. It was brilliant! And then it crashed.
I had pushed my spreadsheets beyond their limits, and two fiscal years of data were corrupted. What I did not know at the time was that the more sophisticated I made my spreadsheets, the more I taxed them. I should have sought a database solution.
The Pros and Cons of Spreadsheets
Spreadsheets have great features such as automatically recalculated formulas, stylish charts and graphs at the click of a mouse, pivot tables, sorting and filtering, and cell formatting. Microsoft Excel even has a “Format as Table” option that will instantly “pretty up” your dull data. The array of features available in spreadsheet applications makes displaying and analyzing large amounts of data easier. Spreadsheets are easy to use, flexible, and inexpensive, which is why they have become the go-to business tool for storing and analyzing data.
As sophisticated as spreadsheets have become, they still have some serious drawbacks. Spreadsheets are not ideal for long-term data storage. They only offer simple query options, do not guard data integrity, and offer little to no protection from data corruption.
The New Guy, Databases
A database is similar to a spreadsheet. In the simplest terms, a database is a collection of tables, organized in columns and rows, just like a spreadsheet. The big difference is that in a database each table has its own defined set of columns, and relationships can be made between the different tables. A relational database management system (RDBMS) standardizes the way data is stored and processed. RDBMS tables store data in a logical manner specifically designed to provide data integrity, reduce duplication, and minimize irregularities.
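To make the idea of related tables and data integrity concrete, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. The table names (departments, expenses), the columns, and the sample values are assumptions made up for illustration; they are not from any particular budget system.

```python
import sqlite3

# In-memory database; the schema below is hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled

conn.execute(
    "CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
conn.execute(
    """CREATE TABLE expenses (
           id INTEGER PRIMARY KEY,
           department_id INTEGER NOT NULL REFERENCES departments(id),
           amount REAL NOT NULL CHECK (amount >= 0)
       )"""
)

conn.execute("INSERT INTO departments (id, name) VALUES (1, 'Finance')")
conn.execute("INSERT INTO expenses (department_id, amount) VALUES (1, 250.0)")


def try_insert(department_id, amount):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO expenses (department_id, amount) VALUES (?, ?)",
            (department_id, amount),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# An expense for a department that does not exist is refused.
rejected_unknown_department = not try_insert(99, 10.0)
# A negative amount violates the CHECK constraint and is refused.
rejected_negative = not try_insert(1, -5.0)
expense_count = conn.execute("SELECT COUNT(*) FROM expenses").fetchone()[0]
```

A spreadsheet would happily accept both bad rows; here the relationship between the two tables and the column rules keep them out.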
A lot of grief can be saved if you take the time to consider the parameters of your project before you start. When deciding if you should create a database for your project, or transfer your current spreadsheets to a database, here are a few things to consider:
- User Access: The number one reason for creating a database instead of a spreadsheet is if multiple people will need to access the file. Sure, you gave everyone a week to update the spreadsheet, but without fail a group of procrastinators will all try to do their updates in the last 30 minutes before the deadline, resulting in a mass of “file is locked for editing by…” error messages. This sort of traffic jam is prevented in a database because multiple people can make edits simultaneously.
- Scope: A spreadsheet is great for tracking a simple list, but will that list continue to grow and potentially become unmanageable? Databases are better for long-term storage of records that will be subject to changes. Databases have a far greater storage capacity than spreadsheets. If your spreadsheet exceeds 20 columns and/or 100 rows, chances are it would be better for you to use a database.
- Reports/Queries: If you have difficulty querying specific datasets for reports, a database could be the answer. When you build a spreadsheet, the data is formatted and arranged to produce one desired report when printed. With a database, the data and reporting features are separate, allowing you to generate multiple reports from the same data. For example, management wants to see company-wide sales records by quarter, the program manager only wants to see annual sales for her region, and the marketing department wants to see monthly sales by product type. Instead of maintaining three spreadsheets with customized views for each party, a database would allow you to use advanced queries to generate all three formats from one source, with no copying and pasting needed!
- Data Integrity: Duplication of data is another reason for moving away from spreadsheets. Does changing one cell force you to update several others? Do people save independent copies of the spreadsheet, causing duplicate and often outdated versions? In a relational database, each piece of data is stored in one place, which minimizes redundancy and saves space.
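The three-report scenario from the list above can be sketched with a single table and three queries. This is only an illustration using Python's built-in sqlite3 module; the sales table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per sale, stored once.
conn.execute(
    """CREATE TABLE sales (
           sale_date TEXT NOT NULL,      -- ISO format, e.g. 2023-01-15
           region TEXT NOT NULL,
           product_type TEXT NOT NULL,
           amount REAL NOT NULL
       )"""
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2023-01-15", "East", "Widgets", 100.0),
        ("2023-02-10", "West", "Gadgets", 200.0),
        ("2023-04-05", "East", "Gadgets", 150.0),
        ("2023-04-20", "West", "Widgets", 50.0),
    ],
)

# Management: company-wide sales by quarter.
quarterly = conn.execute(
    """SELECT strftime('%Y', sale_date) AS year,
              (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,
              SUM(amount)
       FROM sales
       GROUP BY year, quarter
       ORDER BY year, quarter"""
).fetchall()

# Program manager: annual sales for one region.
annual_east = conn.execute(
    """SELECT strftime('%Y', sale_date) AS year, SUM(amount)
       FROM sales
       WHERE region = ?
       GROUP BY year""",
    ("East",),
).fetchall()

# Marketing: monthly sales by product type.
monthly_by_product = conn.execute(
    """SELECT strftime('%Y-%m', sale_date) AS month, product_type, SUM(amount)
       FROM sales
       GROUP BY month, product_type
       ORDER BY month, product_type"""
).fetchall()
```

All three results come from the same four rows, so correcting a sale in one place updates every report the next time it is run.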
Remember that spreadsheets and databases are not mutually exclusive. Just because you upgrade to a database doesn’t mean you have to divorce your spreadsheets. In most cases, a combination of the two works best. You can store your records in a database, allowing you to make advanced reports and queries. In turn, the results of those reports and queries can be exported to spreadsheets for analysis.
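As one possible way to hand query results back to a spreadsheet, the sketch below writes them as CSV, a format that common spreadsheet applications can open. It uses Python's built-in sqlite3 and csv modules; the helper name export_query_to_csv, the sales table, and the sample data are assumptions for illustration.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, sql, out):
    """Run a query and write a header row plus all result rows as CSV."""
    cursor = conn.execute(sql)
    writer = csv.writer(out)
    # cursor.description holds one tuple per result column; item 0 is its name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")  # hypothetical table
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("East", 100.0), ("West", 250.0), ("East", 150.0)],
)

# Write to an in-memory buffer here; a real export would use a file instead.
buffer = io.StringIO()
export_query_to_csv(
    conn,
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region",
    buffer,
)
csv_text = buffer.getvalue()
```

To produce an actual file, pass a file object opened with open("report.csv", "w", newline="") in place of the buffer; the file name here is just an example.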