How are images stored in the database

jtti

Premium Member
Premium
Registered
Joined
Mar 5, 2024
Messages
20
Points
1
There are two ways to store images in a database:
Save the image to the database (the path where the file is stored on the server or the path of the FTP server)
The image is directly written to the database field in the form of binary data stream (base64 form), not recommended, the storage space is too large
The storage use of pictures in the database is generally

User upload profile picture, article illustration, article home page picture and so on
Other aspects of the picture
There are two general ways to store pictures:
The image is stored on disk (on the server), and the database field is the path where the image is stored on the server (this common practice is recommended)
The image is stored directly in binary form in the database, the general database provides a binary field to store binary data. For example, mysql has a blob field. In the oracle database, the type is blob or bfile
To convert images to binary storage:
General idea:

Convert the read image into binary form with your own program. (There are usually built-in functions that can be quickly converted to base64.) Nodejs can do this
insert the blob type field into the data table with the insert into statement.
When I pulled the picture from the database. Send the image content directly
The front-end then receives the binary and displays it to the desired location
In SUMMARY:
The handling of the code is not very cumbersome, and can be easily handled using nodejs. However, we use more is the path to store the image, the actual image is saved on the disk (image binary into the database, which makes the database burden). If you need code, you can see the processing of images in my nodejs.

In the Internet environment, large volume of traffic, database speed and performance are important. Generally, the practice of storing pictures in the database is less, more is to store the picture path in the database, and only need to connect the disk path to load the picture in when displaying the picture.

Because pictures are big fields. A picture to occupy 1MB or even dozens of MB, so the use of the database is a waste of resources, but if the amount of pictures is very small, you can try, or directly in the background to open up space to store files (which also caused a lot of pressure on the server), so it is best to use a third-party file upload platform, such as Qiuniuyun, Ali Cloud, Tencent Cloud, etc.

There are some basic database tuning involved, such as the article is divided into title, author, add time, update time, article content, article keywords, and so on.

The content of the article is generally relatively long. The text field is often used for storage. The content of the article belongs to the large field. General article content can be split into a separate table. Do not store information with the article in a table.

Personal understanding:
The data of a table in mysql is all in a data file, if the data of a large field is also stored in it, for example, the program display list, such as the article list, this time there is no need to show the content of the article, but it will still affect the speed. Database search data is actually to scan the data file, the smaller the file capacity, the faster the speed will be (why the capacity of a single table in 1GB - 2GB when basically divided table). Split into a separate table, that is, a separate file. The idea of mutual independence and separation is not only used in system development, but often exists in real life.

Summary: Three things you should never put in a database: pictures, files, and binary data
reason
The speed of reading/writing to the database can never keep up with the speed of file system processing
Database backups become huge and time consuming
Access to files needs to cross your application layer and database layer
Save thumbnails to a database? Good, then you can't use nginx or any other type of lightweight server to handle them.
About blob types in mysql
bolb(binary large object) Binary large objects are divided into blob, MEDIUMBLOB, and LONGBLOB, just like int. Actually, from small to big

The blob capacity is 64KB
The MEDIUMBLOB capacity is 16 MB
The capacity of LONGBLOB is 4G
To be honest, images are not used much in this way. Using java's serialization function to serialize the value, someone put into this field.

mysql blob field storage image has a communication size setting:
Images to be transferred to mysql storage, then need to involve data communication. mysql has a configuration that limits the size of communication data

conf configuration file max_allowed_packet, mysql default value is 1M

Many images, especially the original ones, may be larger than 1m. The data transferred (that is, the picture) exceeds this setting size. The result will be a mistake

In fact, the so-called performance, the most critical is the database performance. Because as the amount of data in the database increases, most of the time is spent in php, java and other languages waiting for the database to return the data.

After a large number of website visits, the specific language is not the bottleneck, the bottleneck is in the database. Use c, python, php, java can manipulate the mysql database to obtain data. There may be speed execution differences between languages, but these differences are small. At least I think, giving users feel less obvious. The execution difference of 0.0001 second does not make a significant difference to the user.

It may be said that when there is a large concurrency (many users accessing at the same time), the difference will be reflected. In fact, I think that large concurrent access is the database bottleneck. Wait for the database to give data. Not reaching a certain level really doesn't make a difference. The amount of database data reaches a certain level. A language difference of 0.001s can make a difference in user experience. So, this is why php is so good for web development.
Parsing pages is fast (interpreted language, no compilation required). You can use java to work with databases to get data. php does not directly manipulate the database, but calls the data interface provided by java, gets the data, and immediately displays it in the page. This takes advantage of the fast page execution speed of php.

2. Save the picture path in the database
It usually looks like this:

Generate a path by year, month, and day. The specific path is generated according to the year, month, day or year, according to your needs (not necessarily according to the date to generate).


Not even that slash. It is also convenient to expand the system later.
Need to remove the image path in the page display pictures, if it is a relative path, you can use. "/" + "images / 2012/09/25/1343287394783. JPG" for assembly.
If you need to separate the domain name (for example, do the CDN to accelerate) domain name, img1.xxx.com, img2.xxx.com domain names
Direct assembly "http://img1.xxx.com/" + "images / 2012/09/25/1343287394783. JPG"
Of course, the database can be saved with a slash /, /images/ 2012/09/25/1343287394783.jpg
In fact, it is not convenient to unify. For example, when the relative path loads the image, it is "." +” /images/2012/09/25/ 1343287394783.jpg”
Maybe I haven't realized the downside yet. I'll have problems later. Follow the convention, however, do not add slash "images / 2012/09/25/1343287394783 JPG".
Involves a new problem:
Why most of the system are not domain name saved in, like this http://www.xxx.com/images/2012/09/25/1343287394783.jpg saved to the database

The more knowledge we have, the better we can make decisions. It may be that under the influence of "feeling the difference is not very big", to make a decision, but the latter is relatively large impact. At least it increases my workload.
In fact, it is not a heinous crime to save the domain name in it. But no experienced developer would do that. This is an accumulation of experience, so the Shanghai netizen also has no obvious concept of this is normal, he said he did not know the cdn aspect (of course, it is not a big deal to store a domain name). It is necessary to understand the cdn knowledge and when to use the cdn knowledge.
Although you are a developer, you do not need to focus on operations and servers and other knowledge. But knowing a little helps.
This involves cdn acceleration. Regarding the principle of cdn (that is, content delivery network), I understand that its essence is to solve the speed problem caused by distance, and use the nearest service.
CDN
Requesting images from a server in the United States from China. Generally relatively slow, because the distance is so far, there is a loss of network transmission, the farther the distance, the longer the transmission time. Generally, you will see the lower-left corner of the browser display: "Responded, is transmitting data..." . It's not the server itself. In fact, the server has already responded to the request and sent the data to the client, but the network problem has been transmitting, and it has not finished.
In China, it's the distance between the north and the south. The north and the South will also involve the cross-network, the southern users use telecommunications, the northern users are mostly Netcom. Two lines need to be crossed. There's a time delay.
cdn acceleration is a response to this need: no servers in the United States are now requested. Directly in China to place nodes (node is a more general word, can be understood as a server, can also be understood as a machine room, is a point), request close nodes. Then you don't need that much distance.
Previously in Changsha's website, group purchase in the form of city sub-stations. Beijing and Changsha use the same procedure. The server is in Changsha. When users in Beijing visit Beijing Station, they actually need to access the server in Changsha from a distance. It's not gonna get any faster. It has nothing to do with server performance. Didn't understand that then. I don't know what to do. Find a way to do js code compression, browser caching and the like. Actually messing around. This is not to say that these front-end optimization is not important, there are philosophical contradictions between the primary and secondary, and where the bottleneck is to break through. The problem will not be solved without solving the main contradiction. It wasn't a database bottleneck. If you want to optimize the database. It won't be much better. That's the amount of data. You can't reach the bottleneck at all. Where can we talk about the principal contradiction. As I went to work in other companies and came into contact with some things, similar optimization examples of not finding bottlenecks occurred several times around me, and I went blind to optimize without finding bottlenecks first. My colleague may have done it with the mindset that more is better, but the main contradiction (technically the bottleneck) was not found or improved.

If I hadn't thought it was the distance. I wouldn't have thought about cdn, which I didn't even know about at the time. I only know that google these sites must be deployed in China's servers, otherwise, Chinese users also go to visit the United States server, then the best server will be slow.

Due to the relatively large funds to build their own cdn environment and computer room (requiring a large number of servers), it also requires human maintenance. Anyway, the average company can't afford it, in fact, it is not cost-effective. Taobao used to use commercial cdn services, and later commercial businesses could not carry it, so they built their own cdn network. I don't know if Sina has built its own, but in fact, I think it is related to the characteristics of Taobao, there are many shops, whether it is goods or transaction records, a lot of pictures of goods, pictures are static parts, cdn is originally used to do static (pictures, css, js, etc.) request distribution.

Before I saw a word on the Internet, the cdn network is not an ordinary company can afford to play.

General companies to build their own cdn network cost is high, so there are commercial cdn to provide paid rental services, this is a very mature business, many such companies, most of the national Internet companies will use cdn.

Summary: cdn service. It's great for static content. So like commodity pictures, with a large number of visits, rent cdn services, only need to upload the picture to their server.
Example: Singapore to access Hong Kong server, too far away. We can completely put the product pictures on the cloud service in Singapore (we think that the cloud storage now provided to the website is actually a cdn, providing the website with streaming and nearby access). In this way, when Singapore users visit, the image is actually obtained nearby. You don't need to travel very long distances.
 
Latest Threads
Replies
0
Views
40
Replies
2
Views
82
Replies
0
Views
100
Replies
1
Views
95
Recommended Threads

Latest postsNew threads

Latest Hosting OffersNew Reviews

Sponsors

Tag Cloud

You are using an out of date browser. It may not display this or other websites correctly.
You should upgrade or use an alternative browser.

Top