I've been trying to figure this out, but have been coming up blank. For some reason, I can't figure out the right google term to use, and the MySQL books I have don't really address this.
I'm working with a website that will have many, many pieces of text I need to store in the database. Most of these texts will be less than 10,000 Characters, while a few will could be as great as 100,000 characters or more. I won't be able to know ahead of time.
The Scheme's I've come up with are as follows:
Use LongText or MediumText data type for all articles. My concern is that those majority of articles that are only < 10,000 will bloat the database. Please excuse my ignorance in the data storage techniques if this isn't right.
The other way is that as articles are stored in different tables based on their character count. I don't like this method because it would mean have three tables to store the same type of data, and I would need to reference that in another table.
Anyone have thoughts on this?
Thanks, Josh
On Mon, 6 Jun 2005 10:42:09 -0500 Josh Charles josh.charles@gmail.com wrote:
I've been trying to figure this out, but have been coming up blank. For some reason, I can't figure out the right google term to use, and the MySQL books I have don't really address this.
I'm working with a website that will have many, many pieces of text I need to store in the database. Most of these texts will be less than 10,000 Characters, while a few will could be as great as 100,000 characters or more. I won't be able to know ahead of time.
The Scheme's I've come up with are as follows:
Use LongText or MediumText data type for all articles. My concern is that those majority of articles that are only < 10,000 will bloat the database. Please excuse my ignorance in the data storage techniques if this isn't right.
The other way is that as articles are stored in different tables based on their character count. I don't like this method because it would mean have three tables to store the same type of data, and I would need to reference that in another table.
Anyone have thoughts on this?
While I'm not familiar with MySQL internals specifically, most databases ( like PostgreSQL ) store large data like that outside of the normal "table" in an overflow table. This is because it is impossible ot predict how large the data is going to be and thus you can't pre-allocate space for it.
My suggestion would be to try it out and see what happens. If it does cause the table to bloat look into switching to PostgreSQL which doesn't have that problem.
--------------------------------- Frank Wiles frank@wiles.org http://www.wiles.org ---------------------------------
On Mon, 6 Jun 2005, Josh Charles wrote:
I've been trying to figure this out, but have been coming up blank. For some reason, I can't figure out the right google term to use, and the MySQL books I have don't really address this.
I'm working with a website that will have many, many pieces of text I need to store in the database. Most of these texts will be less than 10,000 Characters, while a few will could be as great as 100,000 characters or more. I won't be able to know ahead of time.
My understanding is that TEXT type fields don't allocate anything unless they need the space to store actual data, essentially like VARCHAR. So, TEXT, LONGTEXT, MEDIUMTEXT, etc. differ only in their maximum data size, and the actual mysql table would be essentially the same size unless these maximum sizes were exceeded in a field and thus truncated.
So, if I understand your question correctly, if you want to store all possible honker articles, you can use LONGTEXT and this will only add a few extra bits to each data field.
Okay, here's mysql's table of capacity:
TINYTEXT: 2^8 TEXT: 2^16 MEDIUMTEXT: 2^24 LONGTEXT: 2^32
So, I guess you'd need MEDIUMTEXT to store articles of up to 16,777,216 characters, and there'd be no bloat.
LONGTEXT could go to 4,294,967,296 characters, and still no bloat.
I belive that TEXT field suppport is fairly new in mysql, or at least I haven't known about it for very long.
Regards,
-Don
On Mon, 6 Jun 2005 14:02:04 -0500 (CDT) Don Erickson derick@zeni.net wrote:
Okay, here's mysql's table of capacity:
TINYTEXT: 2^8 TEXT: 2^16 MEDIUMTEXT: 2^24 LONGTEXT: 2^32
So, I guess you'd need MEDIUMTEXT to store articles of up to 16,777,216 characters, and there'd be no bloat.
LONGTEXT could go to 4,294,967,296 characters, and still no bloat.
I belive that TEXT field suppport is fairly new in mysql, or at least I haven't known about it for very long.
So MySQL has a limit on the max size a LONGTEXT can hold? Interesting...
--------------------------------- Frank Wiles frank@wiles.org http://www.wiles.org ---------------------------------