MySQL Databases
UPHPU Meeting
May 18, 2004
Presented by Mac Newbold
mac@macnewbold.com
What is MySQL?
- Free, Open Source, High Performance Database Engine
- It's popular too, and works great with PHP
- LAMP: Linux, Apache, MySQL, PHP
- Based on SQL Standard
- Probably at least as compliant as other DBs
- Versions: 3.23.xx, 4.0.xx Stable
Who am I?
- Mac Newbold - mac@macnewbold.com
- Background: BS CS '01 + MS CS '04 at U.Utah
- Self employed consultant - MNE LLC
- Lately, mostly high-end PHP/MySQL websites
- Spent last 4 years working with MySQL, in combination with PHP, Perl, and C
- Never used PostgreSQL, some experience with Oracle
Biases and Disclaimers
- I like MySQL - a lot
- For me: reliable, capable, featureful
- I haven't used PostgreSQL
- Probably should try it sometime
- So far have never needed to
- Most familiar with 3.23.xx
- Haven't tried some new features from 4.x, 5.x
- Others here likely know more than I
Why MySQL?
- Most popular open-source DB
- Fast, reliable, scalable, etc.
- Integrated with common applications
- Why not?
- Some advanced features are new in MySQL
- Some think PostgreSQL or Oracle is better
- Some people think "free" == "cheap"
The Plan
- Introductions
- Where to learn the basics
- MySQL Table Types: MyISAM, InnoDB
- Database Design Tips and Tricks
- Advanced SQL
- Joins, Subqueries, Triggers, Transactions, Foreign Keys, Stored Procedures
- Optimization Techniques
Where to Learn the Basics
- #1 most important thing: MySQL Manual
- www.mysql.com or dev.mysql.com
- For basic SQL, ask Google: "sql tutorial"
- I'm not planning on covering:
- Basics of installing, administering, using it
- DB Security
- Caching, replication, clustering, etc.
- (We can talk after or on the list)
MySQL Table Types
- Choose one to match your needs
- MyISAM
- Default - most common, most familiar
- InnoDB
- Supports some newer features:
- Transactions, row-level locking, foreign keys
- High volume, high performance
- BDB tables support transactions too
Database Design - Relationships
- Entity-Relationship (ER) Model
- Relationships mapped to DB tables:
- 1-to-1: 1 SSN/Person, 1 Person/SSN
- Usually a column - SSN column in person table
- 1-to-N/N-to-1: One category, many items
- Usually a column in separate table - category table, plus item table with category column
- N-to-N: bank accounts and account owners
- Usually three tables - accts, people, plus ownership table with (acctnum, personID)
Database Design - Keys
- Choosing Primary and Unique keys
- Primary key is usually the minimum set of uniquely identifying attributes
- Include too much ==> unintended duplicates
- Include too little ==> false rejection
- For performance, smaller is better
- Sometimes worthwhile to make unique IDs
Database Design - Keys 2
- Table person == (first, last, dob, ssn,...)
- Key (first, last) == ?
- Key (first, last, dob) == ?
- Key (first, last, ssn) == ?
- Table person == (id, first, last, dob, ssn,...)
- Pros and Cons?
- Size difference, speed difference
Database Design - Efficiency
- Two primary aspects
- Space - disk space, memory footprint
- Time - lookup speed, insert/delete speed
- Know your read vs. write mix
- Redundancy in your data
- Pros: easier recovery, maybe faster
- Cons: harder to keep in sync, larger size
Database Design - Integrity
- "Referential Integrity" - Consistency
- Foreign Keys - identifier in one table used to find rows in another table (e.g. with JOIN)
- Constraints are all application specific
- Ex.: Bank - loans and loan payments
- What if Loan ID in payment table doesn't match any rows in Loans table?
- Foreign key constraints can be enforced
- Then insert/delete order matters a lot
Database Design - Advanced
- Triggers and Stored Procedures
- Let database handle some of the logic/code
- Nice if you have to use multiple interfaces
- I.e. PHP web site, Java GUI app, Perl billing system
- New versions of MySQL support them
- Views - "virtual table" from query result
- Useful for simplification, security, privacy
- Usually can't do inserts
Joins
- Relate one table to another
- Select * from items as i left join category as c on i.catid=c.catid where item="foo";
- New: can use for delete/update too
- Many types of joins
- Some more useful than others
- Used in almost every database application
Join Types
- Inner Join - rows that have a match
- Omit row if either table doesn't have a match
- Specify field(s) to match on
- Outer Join - fill in non-matches with null
- Left Join, Right Join, Full [Outer] Join
- Specify field(s) to match on
- Cross Join - every possible pair of rows
- Huge result, not used often
Join Types 2
- Natural Join - join fields with same name
- In MySQL, same as inner join, or use Natural Left Join to get a left join
- Left/Right/Full Joins
- Select * from A left join B on A.id=B.id
- Left gets every row from A
- Right gets every row from B
- Full gets every row from both
MySQL Join Syntax
- Basic: "from A [join-cmd] B on [join-cond]"
- Handy: "from A join B using (c1,c2)"
- Same as "on a.c1=b.c1 and a.c2=b.c2)"
- Cross: "from A,B" or "A CROSS JOIN B"
- Inner: "A [INNER] JOIN B"
- Outer: "A [LEFT|RIGHT] JOIN B"
- Full: Use "UNION" in 4.x with left+right joins
Fancy-Pants Join Example
- Join a table with itself
- Useful with tree-like structures, for example
- Table cat = (id, name, parent)
- Parent holds id of parent category
- Select c1.id, c1.name,c2.id, c2.name from cat as c1 left join cat as c2 on c1.parent=c2.id;
- Pulls id/name for category and its parent
Subqueries
- Select * from t1 where col1= (select col2 from t2);
- More structured, and readable too
- Alternative to complex joins and unions
- Available in MySQL <= 4.1.x
- Most subqueries can be rewritten as joins
- But they're still nice to have available
Subqueries 2
- Default: true if any subq row matches
- Additional keywords for subqueries:
- IN (alias for =)
- ANY / SOME (optional, same meaning)
- ALL (match all rows instead of any row)
- Select c from t where c > ALL (select c2 from t2)
- EXISTS / NOT EXISTS (subq has results)
Subqueries 3
- Row subqueries:
- Select * from t1 where (1,2) = (select c1,c2 from t2);
- Subqueries in the FROM clause:
- Select ... from (subquery) as t1 ...
- Derived tables, a.k.a. unnamed views
Rewriting Subqueries
- When you can't or don't want to use one
- MySQL < 4.1, performance, portability, etc.
- Select * from t1 where c in (select c from t2)
- ==> Select t1.* from t1,t2 where t1.c=t2.c
- ==> Select t1.* from t1 join t2 on t1.c=t2.c
- Select * from t1 where c not in (select c from t2)
- ==> Select t1.* from t1 left join t2 on t1.c=t2.c where t2.c is null;
Other Advanced SQL
- Stored Procedures (5.0) & Triggers (5.1)
- Put some app logic into db
- Transactions
- Multiple queries as an atomic operation
- Use "commit;" to save, or "rollback;" to cancel
- Can affect performance significantly
- Foreign Keys
- DB enforces constraints on "joinable" columns
Optimizing MySQL Queries
- Common bottlenecks - any guesses?
- Disk seek time, disk bandwidth, CPU, memory bandwidth, sometimes network bandwidth
- Optimization tools
- Benchmarking - Benchmark(1000000, 1+1)
- EXPLAIN - optimizations, indices, etc. in use
- Add indices to your tables appropriately
- Minimize locking
EXPLAIN Explained
- Example:
- EXPLAIN select * from user acct as ua left join user_memb as um on ua.uid = um.uid left join comp_acct as ca on um.cid = ca.cid;
- Result has (table, type, possible_keys, key, key_len, ref, rows, extra)
- Type is const, eq_ref, ref, ..., range, ..., all
- Keys and rows are useful too
- Extra indicates sorting, temp tables, etc.
Easy Optimization Tips
- Make sure your "joinable" columns have the exact same type in both tables
- Add appropriate indices if explain shows that mysql isn't finding one it should be able to use
- Enable the slowquery log, and use it
- Table order in joins has big effects
- Fixed size vs variable size fields
Question and Answer
- If you've got questions, I have answers
- Answers: Free
- Reasonable Answers: $5
- Correct Answers: Priceless
- Thanks for coming and participating!
- I'm on IRC in #uphpu often, or
- Email me at mac@macnewbold.com