Suppose I have this table, which has been in live prod for many years:
[UserContactInfo]
[UserContactInfoId] uniqueidentifier not null
[UserId] uniqueidentifier not null (references the PK of the [User] table)
[FirstName] varchar(50) null
[LastName] varchar(50) null
[Street] varchar(200) null
[City] varchar(50) null
[State] varchar(2) null
[Zip] varchar(10) null
many other fields
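For concreteness, the DDL is roughly the following (constraint names, and which column carries the PK, are approximate):

CREATE TABLE [dbo].[UserContactInfo] (
    [UserContactInfoId] uniqueidentifier NOT NULL
        CONSTRAINT PK_UserContactInfo PRIMARY KEY,
    [UserId]            uniqueidentifier NOT NULL
        CONSTRAINT FK_UserContactInfo_User REFERENCES [dbo].[User] ([UserId]),
    [FirstName]         varchar(50)  NULL,
    [LastName]          varchar(50)  NULL,
    [Street]            varchar(200) NULL,
    [City]              varchar(50)  NULL,
    [State]             varchar(2)   NULL,
    [Zip]               varchar(10)  NULL
    -- ...many other columns
);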
The total maximum row size is currently only about 1 KB -- far below the 8 KB per-row limit.
There are around 10 million rows, or about 9 GB.
Let's say data migration is very painful due to the large table size, the necessity of running all operations in transactions, and the business cost of an extended database maintenance window.
Now I want to add a mailing address.
Option 1: I could add additional columns (ALTER sketch below):
[MailingStreet] varchar(200) null
[MailingCity] varchar(50) null
[MailingState] varchar(2) null
[MailingZip] varchar(10) null
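As I understand it, adding NULLable columns with no defaults is a metadata-only change, so Option 1 would just be:

ALTER TABLE [dbo].[UserContactInfo]
    ADD [MailingStreet] varchar(200) NULL,
        [MailingCity]   varchar(50)  NULL,
        [MailingState]  varchar(2)   NULL,
        [MailingZip]    varchar(10)  NULL;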
Option 2: I could type the address by adding a discriminator column:
[AddressType] tinyint not null
corresponding to a C# enum enforced on write
enum AddressType {
    Physical = 1,
    Mailing = 2
}
and during a DB maintenance window run a script to update all existing rows to [AddressType] = 1.
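Roughly, the Option 2 change plus backfill I have in mind looks like this (the batch size is arbitrary, and adding the column as NULL first and flipping it to NOT NULL afterwards is just to keep the initial ALTER cheap):

ALTER TABLE [dbo].[UserContactInfo] ADD [AddressType] tinyint NULL;
GO

-- Backfill in batches so no single transaction touches all 10M rows.
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (50000) [dbo].[UserContactInfo]
    SET [AddressType] = 1            -- AddressType.Physical
    WHERE [AddressType] IS NULL;

    SET @rows = @@ROWCOUNT;
END
GO

ALTER TABLE [dbo].[UserContactInfo]
    ALTER COLUMN [AddressType] tinyint NOT NULL;
GO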
My question is: which option will perform better?
Option 1 negatives:
(a) Initially all rows will have 4 unused columns, and even after many years it will likely be a low percentage of rows that have actual data. But I believe SQL Server only needs a bit in the NULL bitmap to record that a column is NULL, so the empty columns cost essentially no space.
(b) It moves the row slightly closer to the 8K boundary (albeit still a long way off).
(c) If we need another address type it means adding more columns.
Option 2 negatives:
(a) Potentially, sometime in the future, the table could have twice as many rows. I could add a compound index on [UserId], [AddressType] (see the sketch below), and I'm sure SQL Server has various tricks to optimize performance, but I doubt it would ever approach the performance of a table half the size.
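For reference, the index I'm picturing for Option 2 is just something like this (the name is a placeholder):

CREATE NONCLUSTERED INDEX IX_UserContactInfo_UserId_AddressType
    ON [dbo].[UserContactInfo] ([UserId], [AddressType]);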
Does anyone have better insight into the trade-offs of Option 1 vs Option 2?