RAID in Production
RAID 5 is the configuration most commonly used in production environments. However, it comes with limitations you should be aware of.
RAID 5 is not well-suited for transactional databases because it performs poorly with write-heavy workloads. The reason is simple: RAID 5 is optimized for reads, not writes.
If you are dealing with frequent transactions, RAID 10 is recommended. It provides better performance for writes, though it is significantly more expensive due to disk requirements.
Factors for Capacity Planning
When planning for capacity, it's important to focus on workload rather than just storage volume. Here are the critical factors to consider:
Drive Space
Do you need Direct Access Storage? Think about whether your application might rely on single drives while supporting thousands of concurrent users.
Disk Controller Limitations
The controller can become a bottleneck if it isn't able to handle the volume of reads and writes required by your workload.
Planning for Workload, Not Volume
Always plan for the type of work being done—not just how much storage you have. A system with terabytes of space is useless if it can't handle the IOPS needed for your application.
Assessing Workload
So, how do you actually measure and assess workload for proper planning?
- Use Windows Performance Monitor to check the number of database disk reads and writes generated by an application.
- Check the IOPS (Input/Output Operations Per Second) of the individual disk. This value is provided by the hardware manufacturer. A good baseline to keep in mind is 125 IO per disk.
Calculating Disks and Controllers for Maximum Throughput
To achieve maximum throughput, you'll need to determine both the number of disks and controllers required.
You'll also need a RAID adjuster in your calculations. For example, RAID 10 writes data twice, so the RAID adjuster value will be 2.
Formula:
Required Disks = (Reads/sec + (Writes/sec × RAID Adjuster)) ÷ Disk IOPS
Using 125 as the IOPS value is a good rule of thumb unless specified otherwise by the manufacturer.
Protocols Used in Databases
Different protocols are used to connect and communicate with databases. Common ones include:
- Named Pipes
- Shared Memory: Mostly used for debugging purposes.
- TCP/IP: The most widely used for network-based communication.
Oracle: Database vs User vs Schema
In Oracle, the concepts of database, user, and schema often confuse beginners. Here's a simple breakdown:
- In Oracle, users and schemas are essentially the same thing.
- A user is the account you use to connect to the database. A schema is the set of objects (tables, views, etc.) that belong to that account.
- When you create a user with the
CREATE USERstatement, you also create the corresponding schema (initially empty). - An administrator can then grant privileges to that user, allowing it to create tables, run queries, insert data, and more.
- The database itself contains all the users, their data, and predefined system objects (tables, views, system users) that make Oracle function.
For a deeper dive into Oracle's architecture, check the official documentation:
Oracle Database Concepts Guide
Final Thoughts
Capacity planning isn't about just buying bigger disks—it's about understanding workload patterns, RAID trade-offs, and controller limitations. By measuring IOPS, calculating disk requirements properly, and knowing how your database handles connections and schemas, you can build a setup that is both reliable and performant.