1   Copyright © 2012, Oracle and/or its affiliates. All rights reserved.   Oracle
Debugging and Configuration
Best Practices for
Oracle Linux

Greg Marsden
Senior Director, Linux and Virtualization



Agenda


         Key Linux Tips and Tricks
         Common Issues
         Diagnostic Tools and Use Cases
         Do it Yourself Debugging
         Ksplice in the Datacenter



Tips and Tricks:
        Key Points




Key Linux Tips and Tricks



Kernel Tuning: Oracle Preinstall RPM
Diagnostic Tools: kdump and oswatcher
Result: Best Performance and Reliability




Oracle Preinstall Package and Templates
        Configure Oracle Products Automatically


           oracle-rdbms-server-11gR2-preinstall-1.0.6.el6.x86_64.rpm
           Per-Product Preconfiguration Package
                    – Based on Validated Configuration's real-world stack testing
                    – Includes Product Release Notes recommendations
                    – Installs necessary dependencies and kernel tuning parameters
                    – Individual for each Oracle product
           Oracle VM Template for Oracle RDBMS Server
                    – Production-ready, installed virtual machine templates from eDelivery


System Diagnostics
        Critical Diagnostics Software should run at all times


           oswatcher utility: Install and leave running to collect over-time
            information about system activity.
           serial console or netconsole to remotely monitor system activity in
            the case of a disk, network or system outage.
           kexec crash collection utilities to gather forensic information from
            malfunctioning systems.




Tips and Tricks:
        Memory Management




“Help! My system has 250 GB of RAM and I'm running out of
         memory! My consultants are telling me we can't scale
         with a 120GB SGA and this many connections, but I
         can't fit any more RAM in this system.”
           Anonymous DBA
           Oracle User




Issue: Not using Hugepages

         Frequent Issue

• Symptom: Out of Memory Errors, slow performance. Detected via oswatcher.
• Cause: SGA mapped in 4k pages instead of 2MB
• Solution: Use Hugepages
         • Hugepages are faster.
         • Hugepages are “pinned” and won't be swapped.

         I found the following:
           13:09:19 57591060k free      159 client connections
           13:26:01 26189944k free      1826 client connections
           13:32:31 15547144k free      2024 client connections
           13:57:00 467048k free       2037 client connections
           (here is where we begin swapping memory to disk)

         I also found this:
           zzz ***Fri Aug 9 13:23:22 PDT 2011
           MemTotal: 250 GB
           MemFree: 464 MB
           PageTables: 112 GB
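
The PageTables line is the telling figure on this slide. A minimal Python sketch of the same check, assuming /proc/meminfo-style text as input. The sample values are the slide's figures converted to kB, and the function names are illustrative, not part of oswatcher:

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to their values in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def pagetable_share(fields):
    """Fraction of physical memory consumed by page tables."""
    return fields["PageTables"] / fields["MemTotal"]


# Slide figures converted to kB (250 GB, 464 MB, 112 GB); illustrative only.
SAMPLE = """\
MemTotal:       262144000 kB
MemFree:           475136 kB
PageTables:     117440512 kB
"""

if __name__ == "__main__":
    share = pagetable_share(parse_meminfo(SAMPLE))
    print(f"Page tables use {share:.1%} of RAM")
```

On a live system the same functions can be fed the contents of /proc/meminfo. A large page-table share next to a large SGA is the hint that the SGA is mapped in 4k pages.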




Performance with Hugepages


         Without Hugepages
          • 200 Connections to a 12.9GB SGA
          • Before DB Startup PageTables: 7400 kB
          • After DB Startup PageTables: 652900 kB
          • After 200 PQ slaves run query PageTables: 6189248 kB
          • Time to complete: 00:10:23.60

         With Hugepages
          • 200 Connections to a 12.9GB SGA
          • Before DB Startup PageTables: 7748 kB
          • After DB Startup PageTables: 21288 kB
          • After 200 PQ slaves run query PageTables: 80564 kB
          • Time to complete: 00:00:18.77
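
To put the two columns side by side, the arithmetic can be sketched in Python. The numbers are taken from this slide; the variable and function names are illustrative:

```python
def hms_to_seconds(hms):
    """Convert an HH:MM:SS.ss string to seconds."""
    hours, minutes, seconds = hms.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Figures after the 200 PQ slaves ran the query, as shown on the slide.
WITHOUT_HUGEPAGES = {"pagetables_kb": 6189248, "elapsed": "00:10:23.60"}
WITH_HUGEPAGES = {"pagetables_kb": 80564, "elapsed": "00:00:18.77"}

pagetable_ratio = WITHOUT_HUGEPAGES["pagetables_kb"] / WITH_HUGEPAGES["pagetables_kb"]
speedup = hms_to_seconds(WITHOUT_HUGEPAGES["elapsed"]) / hms_to_seconds(WITH_HUGEPAGES["elapsed"])

if __name__ == "__main__":
    print(f"Page tables are {pagetable_ratio:.1f}x smaller with hugepages")
    print(f"The query finishes {speedup:.1f}x faster with hugepages")
```

For this test the page tables come out roughly 77 times smaller and the run roughly 33 times shorter.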




Hugepages and Transparent Hugepages
         Performance for DB and Middleware Applications

            “Regular” Hugepages [Ref. Doc ID 749851.1]
                     – Reduce footprint of individual Oracle database connections.
                     – Increase performance and scalability.
                     – Requires manual tuning after SGA changes, and does not work with AMM.
            Transparent Hugepages
                     – Transparent hugepages do not help the RDBMS use case.
                     – Auto-allocate hugepages for large memory allocations. Great for
                             Java/middleware/applications.
                     – New for UEK and OL6!
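
Because regular hugepages need retuning whenever the SGA changes, it helps to see the sizing arithmetic. A rough sketch, assuming 2 MB hugepages (as on the earlier slide) and ignoring any headroom a real configuration would add; the result is the kind of value that would go into vm.nr_hugepages, and the function name is illustrative:

```python
import math

HUGEPAGE_KB = 2048  # assumption: 2 MB hugepages


def hugepages_needed(sga_gb, hugepage_kb=HUGEPAGE_KB):
    """Smallest number of hugepages that covers an SGA of sga_gb gigabytes."""
    sga_kb = sga_gb * 1024 * 1024
    return math.ceil(sga_kb / hugepage_kb)


if __name__ == "__main__":
    for sga_gb in (12.9, 120):
        print(f"{sga_gb} GB SGA -> {hugepages_needed(sga_gb)} hugepages")
```

Check the actual hugepage size on the target system (the Hugepagesize line in /proc/meminfo) before relying on the 2 MB assumption.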


Issue: Slow Performance at 95% RAM
         OL5 Specific issue with large memory allocation

           Symptoms
                     – System is swapping and shows low free/cached memory
                     – Reduced system performance
           Cause: Usually the kernel is hogging CPU in try_to_free_pages from
            pagecache, inactive lists.
           Solution
                     – Ensure you are running a shrink_zone patched kernel: UEK, OL6, or
                            OL5+BUG6086839
                     – If system is swapping but performance is OK, get more RAM.


Issue: System Swapping with Free Memory
         NUMA Specific Problem

             Symptom: System starts to swap while reporting free RAM
                      – vmstat reports free memory.
                      – dmesg has “order 5 allocation failed” messages.
                      – If allocations of order lower than 5 are failing, there are larger issues
             Cause: Memory Fragmentation. On NUMA systems, caused by
              fragmentation of node-local memory for kernel applications.
             Solutions:
                      – Disable NUMA
                      – Decrease MTU size if using jumboframes


Memory Accounting
         What is Linux Doing With My Memory?

            Available memory = Free + Cached
                     – All free space on Linux is used for pagecache.
                     – This behavior cannot be disabled.
            Process Shared Memory is hard to find in Linux
                     – RSS double counts shared memory, Total includes unmapped pages.
                     – Use /proc/<pid>/smaps to see real process memory usage.
            cgroups: New features in the latest kernels let you restrict RAM
                     – Useful to throttle pagecache use by backup processes
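
As a sketch of the smaps approach, the Python below totals the Rss and Pss fields of a /proc/<pid>/smaps dump. It assumes the kernel reports a Pss field (proportional share of shared pages), which older kernels may not; the sample mappings are invented:

```python
def sum_smaps(text, keys=("Rss", "Pss")):
    """Total the requested per-mapping fields (in kB) across an smaps dump."""
    totals = dict.fromkeys(keys, 0)
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in totals:
            totals[name] += int(rest.split()[0])
    return totals


# Invented sample: one private mapping and one shared-memory segment.
SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521      /usr/bin/example
Size:                328 kB
Rss:                 296 kB
Pss:                 148 kB
7f0000000000-7f0000200000 rw-s 00000000 00:04 32768   /SYSV00000000 (deleted)
Size:               2048 kB
Rss:                2048 kB
Pss:                  10 kB
"""

if __name__ == "__main__":
    totals = sum_smaps(SAMPLE)
    print(f"Rss {totals['Rss']} kB, Pss {totals['Pss']} kB")
```

The gap between the two totals is the shared memory that RSS counts once per process.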



Swap: What is it good for?

         Tuning Swap Space
 Swap is a highly contentious topic on Linux
         – Benefit: Allows “room to grow” for inadequately sized systems.
         – Drawback: Much slower than memory access, often makes problems worse.
 Recommendation: Use swap, but ensure IO to swap disk is kept close to zero.

         [Figure: vmstat output]
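
Swap IO is reported in the si and so columns of vmstat. A minimal Python sketch that pulls those columns out of captured vmstat text, assuming the standard column header line; the sample rows are invented:

```python
def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs, one per vmstat sample row."""
    columns = None
    samples = []
    for line in vmstat_text.splitlines():
        tokens = line.split()
        if "si" in tokens and "so" in tokens:
            columns = tokens  # the header row naming each column
            continue
        if columns and len(tokens) == len(columns) and tokens[0].isdigit():
            row = dict(zip(columns, map(int, tokens)))
            samples.append((row["si"], row["so"]))
    return samples


# Invented sample: the second row shows heavy swap-in and swap-out.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 467048  10240 204800    0    0     5    12  100  200  5  2 92  1  0
 3  2 102400  20480   1024  51200  850 1200   900  1500  300  800 10 15 40 35  0
"""

if __name__ == "__main__":
    for si, so in swap_activity(SAMPLE):
        status = "ok" if si == 0 and so == 0 else "swapping"
        print(f"si={si} so={so} {status}")
```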


Tips and Tricks:
         General Recommendations




Other Configuration Trouble
         Assorted Common Configuration Issues

            Use UUID to Mount System Disks
                     – Symptom: System panics after upgrade
                     – Cause: New hardware, drivers, or kernel reorders device discovery
                     – Cautions: May not work with LVM snapshot
            NFS Locks Not Released on Reboot
                     – Cause: kernel and DNS have different hostnames
                     – Solution: ensure kernel hostname is fully qualified. See BUG 3156942.
            Cluster Reboots with OCFS2
                     – Cause: Network or Disk outages can cause OCFS2 to fence nodes
                     – Solution: Ensure OCFS2 timeouts are greater than storage/network failover
                             timeouts. Defaults may be too short for o2cb heartbeat.
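
For the first item, a quick audit of /etc/fstab can flag mounts that still depend on device discovery order. A rough Python sketch, assuming that entries starting with /dev/sd or /dev/hd are the ones at risk; the sample entries and the UUID are invented:

```python
def non_persistent_mounts(fstab_text):
    """Return (device, mount point) pairs that name a raw /dev/sdX or /dev/hdX device."""
    flagged = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        device, mount_point = fields[0], fields[1]
        if device.startswith(("/dev/sd", "/dev/hd")):
            flagged.append((device, mount_point))
    return flagged


# Invented sample fstab; only the /boot entry depends on discovery order.
SAMPLE = """\
# /etc/fstab
UUID=3e6be9de-8139-11d1-9106-a43f08d823a6 /      ext4 defaults 1 1
/dev/sda1                                 /boot  ext4 defaults 1 2
/dev/mapper/vg0-u01                       /u01   ext4 defaults 1 2
LABEL=swap                                swap   swap defaults 0 0
"""

if __name__ == "__main__":
    for device, mount_point in non_persistent_mounts(SAMPLE):
        print(f"{mount_point} is mounted by device name {device}")
```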


Performance Tuning Kernel Parameters
         System Scheduling
          vm.swappiness
                         100: Force aggressive swapping
                         0: Insurance against a backup process hogging all system memory
         Network Protocol Buffers
          net.core.wmem_default/max: Buffer size for outgoing network packets.
          net.core.rmem_default/max: If these values are set too small, system may discard TCP packets

         Memory Management
          vm.dirty_ratio: encourage frequent pagecache writeback
          vm.lowmem_reserve_ratio/vm.min_free_kbytes: reserve physical memory for kernel allocations




Bugs Fixed in Enterprise Linux

         Oracle Linux 5 Bug Fixes

          Oracle finds and fixes critical
              bugs in Enterprise Linux.
                    –      Red Hat Compatible Kernel vs.
                           Oracle-Modified kernel
                    –      Install the Compatible Kernel for
                           bug-for-bug compatibility with RHEL

          Patches required for correct
              Oracle product operation




Oracle Patches Linux
         Specially Tuned Linux Kernels for Customer Requirements

            Staying up-to-date with your Linux distribution is very important.
            Bug-Fixed Oracle Linux Kernel
            UEK: Unbreakable Enterprise Kernel
                      – Top Performing Kernel. World Record TPCC Benchmark.
                      – Provides OL6 performance on OL5 systems.
            Backporting of fixes is a temporary solution, not a permanent one.
                      – Always plan to update or ksplice to the latest kernel version.




UEK: Modern Linux for Oracle

         Fast, Modern, Reliable
      Get the latest in performance and
         features from Linux, tested by Oracle.
               –      All new kernel, optimized for Oracle.
      Stay closer to mainline Linux with
         patches to improve performance for
         Oracle workloads.
               –      All patches are open source and
                      submitted to mainline Linux
               –      Patches provided via RPM and via ksplice
               –      World Record TPCC Benchmark March '12.




Tips and Tricks:
         Diagnostics




Issue: Post-Event Diagnostics
         What to do after a crash or hang?

            Hard hangs
                     – Panic, OOPS, nmi_watchdog
                     – “Spontaneous Reboot”
            Brownouts
                     – Performance Degradation


            Cluster Scenarios
                     – Network or Disk may have gone away, triggering the fence
                     – Need to maintain crash data in the event of loss of net/disk
                     – Ensure timeouts (like OCFS2) are set correctly



Two Kinds of Critical OS Logging

          Continuous OS Logging
           oswatcher continuous logging collects timestamped snapshots of system commands:
                          ps, top
                          slabinfo, meminfo
                          vmstat, mpstat, iostat
           Other tools can be employed as well, like sar or collectl.

          Panic/Hang Event Logging
           serial console or netconsole should always be set up for any production system. No exceptions.
           Consoles also preserve sysrq data.
           kdump system memory image collection.




Console Logs
         Finding Faults if Disk or Networking Fail
            Kernel Messages may not be available after a crash
              – Serial Consoles are proven technology for preserving console output
            How to capture Serial output:
              – Reliable ways to capture serial output:
                   ILOM virtual console
                   Serial-Over-Lan BIOS config
                   Inexpensive DB9-USB converter or Serial Concentrator
              – Unreliable ways to capture serial output:
                   Physically attached terminal with 'setterm -blank 0' and system not configured to reboot
                   netconsole (can be difficult to configure, and subject to network outages)
            Things to check:
              – Ensure Baud Rate is high enough (not 9600 baud)
              – For Virtual Console, ensure console history is set up to capture a large amount of output



Keyboard and Console Diagnostics
          SysRq Key Combinations for Diagnostics

 Magic SysRq Key
     M: dump system memory statistics
     P, W: dump the stack for all processors
     T: dump the kernel stack trace for all processes
     C: Immediately cause a system crash
     S … U … B: Emergency Sync all disks, Unmount disks, reboot.
     Some of these operations (like stack trace) dump a lot of data (1MB or more!).

 How to Invoke Magic SysRq…
     Console: Alt + Sys Rq + <cmd>
     Serial Console: <Break> <cmd>
     Command Line: echo t > /proc/sysrq-trigger
     Oracle VM dom0: xm sysrq <domain ID> <cmd>
     Ensure kernel.sysrq = 1
     These operations take full priority in the kernel. Do not run them in your monitoring scripts; use them carefully!
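
The command-line form can be wrapped so that only the dump commands from this slide are accepted. A minimal Python sketch, with the allowed set and the function name being illustrative choices. The destructive keys (C, S, U, B) are deliberately refused, and writing to the real trigger file requires root:

```python
# Dump-only commands from the slide; anything else is rejected.
DUMP_COMMANDS = {
    "m": "system memory statistics",
    "p": "stack for all processors",
    "w": "stack for all processors",
    "t": "kernel stack trace for all processes",
}


def trigger_sysrq(command, trigger_path="/proc/sysrq-trigger"):
    """Write a single dump-only SysRq key to the trigger file and describe it."""
    key = command.lower()
    if key not in DUMP_COMMANDS:
        raise ValueError(f"refusing SysRq command {command!r}")
    with open(trigger_path, "w") as trigger:
        trigger.write(key)
    return DUMP_COMMANDS[key]
```

The output lands in the kernel log and on the console, which is one more reason to have a serial console capturing it.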


Diagnostic/Destructive Kernel Parameters
         Enable Keyboard- and Console-based Debugging
          kernel.sysrq=1
                         Always have this set to enable debug commands
         System-wide Events
          panic_on_oom: Panic for Out of Memory condition
                         (Alternative would be to kill the high memory process)
          panic_on_oops: Panic for system problems
                         (off: some modules may survive an oops, but system state is inconsistent)
         Per-Process Events
          hung_task_timeout_secs: Warn if a process is not scheduled for (timeout) seconds.
                         Can cause a lot of log messages, not usually useful
          hung_task_panic: Cause a stack trace and system panic if the timeout is hit
                         Can be useful for debugging. Not good to set by default.
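
To audit these settings across systems, each sysctl name maps to a file under /proc/sys. A minimal Python sketch of reading them, assuming plain dotted names with no dots inside a component; the function names are illustrative:

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Translate a dotted sysctl name such as kernel.sysrq into its /proc/sys path."""
    return os.path.join(root, *name.split("."))


def read_sysctl(name, root="/proc/sys"):
    """Return the sysctl's current value as a list of whitespace-separated fields."""
    with open(sysctl_path(name, root)) as handle:
        return handle.read().split()


if __name__ == "__main__":
    for name in ("kernel.sysrq", "kernel.panic_on_oops", "vm.panic_on_oom"):
        try:
            print(name, "=", " ".join(read_sysctl(name)))
        except OSError as error:
            print(name, "is not readable:", error)
```

The root parameter exists only so the lookup can be pointed at a copy of /proc/sys, for example one collected from another host.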



Crash Kernel Memory Snapshots
         Set Your System to Automatically Dump Core

           kdump: uses Linux kexec function to save kernel stacks after a panic
                     – Only way to get diagnostic data if disk or network are not available.
                     – Reboots the system into a protected memory area to save crashed kernel
           Very common errors:
                     – Not testing kdump: Requires specific memory tuning (crashkernel=) and
                       also requires specific HBA or network drivers
                     – Have dedicated space for crash dumps. Preferably not in your root
                       partition. Remember, vmcore == physical memory size.
                     – Local disk is faster and more reliable than network dumps.
                     – Use gzip or `makedumpfile` to compress cores prior to upload
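
Since an uncompressed vmcore can be as large as physical memory, a simple pre-check is to compare MemTotal with the free space in the dump directory. A rough Python sketch; /var/crash is an assumed dump location and the function names are illustrative:

```python
import shutil


def has_room_for_vmcore(mem_total_kb, free_bytes):
    """True if free_bytes can hold an uncompressed dump of all physical memory."""
    return free_bytes >= mem_total_kb * 1024


def dump_dir_has_room(mem_total_kb, dump_dir="/var/crash"):
    """Check the filesystem that holds dump_dir (path is an assumption)."""
    return has_room_for_vmcore(mem_total_kb, shutil.disk_usage(dump_dir).free)


if __name__ == "__main__":
    mem_total_kb = 250 * 1024 * 1024  # the 250 GB system from the earlier slide
    print(has_room_for_vmcore(mem_total_kb, 300 * 1024 ** 3))
```

This is a worst-case bound, since compressed dumps are smaller. Only an actual test crash confirms that kdump works.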


Reading a Kernel Core

         Using the Crash Utility

 Crash `bt` backtrace or SysRq-T stack
         – Get debug symbols from oss.oracle.com

 Red flags:
         –      Many processes in D state (IO)
         –      Many processes in same kernel routine (contention?)
 Caution: Stack traces can be 1M or greater. Don't do this frequently.

         [Figure: dmesg output after SysRq-T]


Diagnostic Tools for Brownouts
         Getting more out of your diagnostic tools


            strace -ttT: Diagnose slow processes
                     – Automatically timestamp system calls
                     – Useful for diagnosing specific process syscall latency
                     – Also helpful to determine if a problem is in kernel or usermode
            Crash utility on Virtual machines
                     – `xm dump-core` takes a noninvasive kernel snapshot of a system
                     – Provides memory, stack traces, and kernel logs
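
With -tt each strace line starts with a microsecond timestamp and with -T it ends with the time spent in the call in angle brackets. A minimal Python sketch that filters a capture for slow calls, assuming a single traced process (no pid prefix from -f); the sample lines and the 0.1 second threshold are invented:

```python
import re

# Matches lines such as: 13:09:19.000200 read(7, ..., 8192) = 8192 <0.412907>
STRACE_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<syscall>\w+)\(.*<(?P<elapsed>\d+\.\d+)>\s*$"
)


def slow_syscalls(strace_text, threshold=0.1):
    """Return (timestamp, syscall, seconds) for calls at or above the threshold."""
    slow = []
    for line in strace_text.splitlines():
        match = STRACE_LINE.match(line)
        if match and float(match.group("elapsed")) >= threshold:
            slow.append(
                (match.group("time"), match.group("syscall"), float(match.group("elapsed")))
            )
    return slow


# Invented sample capture; the path is hypothetical.
SAMPLE = r"""
13:09:19.000100 open("/u01/example.dbf", O_RDONLY) = 7 <0.000031>
13:09:19.000200 read(7, "..."..., 8192) = 8192 <0.412907>
13:09:19.413300 write(1, "done\n", 5) = 5 <0.000012>
"""

if __name__ == "__main__":
    for timestamp, syscall, seconds in slow_syscalls(SAMPLE):
        print(f"{timestamp} {syscall} took {seconds:.3f}s")
```

Long times inside individual syscalls point toward the kernel or IO. Long gaps between consecutive timestamps with fast syscalls point toward user-mode work.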




Tips and Tricks:
         Ksplice




Issue: Old Kernels
         Newer Kernel Releases Fix Your Bugs


            Symptom: Customer systems are encountering known/fixed issues
                     – Examples: tcp window_size, shrink_zone, etc.
                     – Use new kernels for new features: NFSv4, dtrace, btrfs.
            Cause: Older kernels are not 'stable'. New kernels fix bugs.
            Solutions:
                     – Implement a periodic update schedule for kernel and OS packages, or…
                     – Use ksplice to stay up to date with patches




Ksplice: Rebootless Kernel Patching
         Zero Downtime Patching for Bugs and Security Updates


            Ksplice keeps your system up to date
                     – Integrated with ULN
                     – Now available in online and offline modes
            Using Ksplice for Diagnostics and Patching
                     – Real-World NFS Example




Summary


          Key Linux Tips and Tricks
                     – kexec and oswatcher

          Common Issues
                     – memory management and configuration

          Diagnostic Tools and Use Cases
          Ksplice in the Datacenter


ORACLE LINUX PAVILION


         Visit our partners and don't miss these events sponsored by QLogic
            Smoothie Bar on Monday, Oct 1st, 2:30-5:30pm
            Ice Cream Social on Wednesday, Oct 3rd, 1-2pm



Oracle Linux Sessions
            Tuesday, Oct 2nd

              Oracle Linux TRACK SESSIONS

              Session   Title and Speakers                                          Time       Location
              GEN8726   General Session: Oracle Linux Strategy and Roadmap          10:15 AM   Moscone South - 103
                        Speakers: Wim Coekaerts and Monica Kumar, Oracle
              CON8731   Top Technical Tips for Automatic and Secure Oracle Linux    11:45 AM   Moscone South - 270
                        Deployments
                        Speakers: Lenz Grimmer, Oracle, Martin Breslin, SEI Global,
                        Ed Bailey, Transunion




Oracle Linux Sessions
            Wednesday, Oct 3rd

               Oracle Linux TRACK SESSION

               Session   Title and Speakers                                         Time       Location
               CON8729   Why Switch to Oracle Linux?                                3:30 PM    Moscone South - 270
                         Speakers: Monica Kumar, Mike Radomski, SUNY

               HANDS ON LABS

               Session   Title                                                      Time       Location
               HOL9383   Oracle Linux Package Management: Configuring and           10:15 AM   Marriott Salon 14/15 YB level
                         Enabling Services
               HOL9384   Oracle Linux Storage Management with LVM and               11:45 AM   Marriott Salon 14/15 YB level
                         Device-Mapper




Oracle Linux Sessions
            Thursday, Oct 4th

               HANDS ON LABS

               Session   Title                                                      Time       Location
               HOL9383   Oracle Linux Package Management: Configuring and           12:45 PM   Marriott Salon 14/15 YB level
                         Enabling Services
               HOL9384   Oracle Linux Storage Management with LVM and               2:15 PM    Marriott Salon 14/15 YB level
                         Device-Mapper




NEW: Oracle Linux Curriculum Footprint
             Oracle Linux Training from Oracle University


             Unix/Linux Essentials                                                        Oracle Linux System Administration
             Instructor-led and Live virtual                                              Instructor-led and Live virtual


               This Oracle Linux System Administration course teaches
               you all the essential system administration skills and includes
               key information specific to Oracle Linux: Unbreakable Enterprise
               Kernel, Ksplice, ULN, and other key features
                                                                                      Visit:
                                                                            oracle.com/education/linux


Resources
         Join our communities

            @ORCL_Linux
            Facebook.com/OracleLinux
            Blogs.oracle.com/linux
            Oracle Linux Experts Group
            YouTube.com/oraclelinuxchannel

         Visit Oracle.com/linux
         Download for FREE: edelivery.oracle.com/linux





The preceding is intended to outline our general product direction. It is intended
         for information purposes only, and may not be incorporated into any contract.
         It is not a commitment to deliver any material, code, or functionality, and should
         not be relied upon in making purchasing decisions. The development, release,
         and timing of any features or functionality described for Oracle's products
         remains at the sole discretion of Oracle.





More Related Content

PDF
Why new hardware may not make Oracle databases faster
PDF
Performance Whack A Mole
PPTX
Sun Oracle Exadata Technical Overview V1
PDF
12 things Oracle DBAs must know about SQL
PDF
Whitepaper: Running Oracle e-Business Suite Database on Oracle Database Appli...
PDF
How to configure SQL Server for SSDs and VMs
PDF
Running E-Business Suite Database on Oracle Database Appliance
DOC
Oracle Database Appliance - RAC in a box Some strings attached
Why new hardware may not make Oracle databases faster
Performance Whack A Mole
Sun Oracle Exadata Technical Overview V1
12 things Oracle DBAs must know about SQL
Whitepaper: Running Oracle e-Business Suite Database on Oracle Database Appli...
How to configure SQL Server for SSDs and VMs
Running E-Business Suite Database on Oracle Database Appliance
Oracle Database Appliance - RAC in a box Some strings attached

What's hot (20)

PDF
MySQL Performance Tuning: The Perfect Scalability (OOW2019)
PPT
Sun Oracle Exadata V2 For OLTP And DWH
PDF
PostgreSQL and Benchmarks
PDF
How to configure SQL Server like a pro
PPTX
Optimizing Oracle databases with SSD - April 2014
PDF
MySQL Cluster Performance Tuning - 2013 MySQL User Conference
PDF
MySQL Monitoring 101
PPTX
Hadoop/HBase POC framework
PDF
PostgreSQL on Solaris
PDF
Apache Spark At Scale in the Cloud
PPTX
Oracle Exadata X2-8: A Critical Review
PDF
Best Practices with PostgreSQL on Solaris
PPTX
Exadata Backup
PDF
Drilling Deep Into Exadata Performance
PPTX
Making the most of ssd in oracle11g
PPTX
Oracle Database Appliance RAC in a box Some Strings Attached
PPTX
LVOUG meetup #2 - Forcing SQL Execution Plan Instability
PDF
Cloud Consolidation with Oracle (RAC) - How much is too much?
PDF
Oow Ppt 2
PPTX

Debugging and Configuration Best Practices for Oracle Linux

  • 1. 1 Copyright © 2012, Oracle and/or its affiliates. All rights reserved. Oracle
  • 2. Debugging and Configuration Best Practices for Oracle Linux. Greg Marsden, Senior Director, Linux and Virtualization
  • 3. Agenda
      * Key Linux Tips and Tricks
      * Common Issues
      * Diagnostic Tools and Use Cases
      * Do it Yourself Debugging
      * Ksplice in the Datacenter
  • 4. Tips and Tricks: Key Points
  • 5. Key Linux Tips and Tricks
      * Kernel Tuning: Oracle Preinstall RPM
      * Diagnostic Tools: kdump and oswatcher
      * Result: Best Performance and Reliability
  • 6. Oracle Preinstall Package and Templates: Configure Oracle Products Automatically
      * oracle-rdbms-server-11gR2-preinstall-1.0.6.el6.x86_64.rpm
      * Per-Product Preconfiguration Package
          – Based on Validated Configuration's real-world stack testing
          – Includes Product Release Notes recommendations
          – Installs necessary dependencies and kernel tuning parameters
          – Individual for each Oracle product
      * Oracle VM Template for Oracle RDBMS Server
          – Production-ready, installed virtual machine templates from eDelivery
  • 7. System Diagnostics: Critical Diagnostics Software should run at all times
      * oswatcher utility: Install and leave running to collect over-time information about system activity.
      * serial console or netconsole to remotely monitor system activity in the case of a disk, network or system outage.
      * kexec crash collection utilities to gather forensic information from malfunctioning systems.
  • 8. Tips and Tricks: Memory Management
  • 9. “Help! My system has 250 GB of RAM. I'm running out of memory! My consultants are telling me we can't scale with a 120GB SGA and this many connections, but I can't fit any more RAM in this system.” (Anonymous DBA, Oracle User)
  • 10. Issue: Not using Hugepages (Frequent Issue)
      * Symptom: Out of Memory Errors, slow performance. Detected via oswatcher.
      * Cause: SGA mapped in 4k pages instead of 2MB
      * Solution: Use Hugepages
          – Hugepages are faster.
          – Hugepages are “pinned” and won't be swapped.
      * I found the following:
          13:09:19  57591060k free   159 client connections
          13:26:01  26189944k free  1826 client connections
          13:32:31  15547144k free  2024 client connections
          13:57:00    467048k free  2037 client connections (here is where we begin swapping memory to disk)
      * I also found this:
          zzz ***Fri Aug 9 13:23:22 PDT 2011
          MemTotal: 250 GB
          MemFree: 464 MB
          PageTables: 112 GB
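The PageTables figure on this slide is the kind of signal that can be checked automatically in collected snapshots. The following is a minimal sketch, assuming the snapshot is in the standard /proc/meminfo layout (values in kB); the function names and any alert threshold are illustrative and not part of oswatcher.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> value in kB."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line (for example a timestamp banner)
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def page_table_share(fields):
    """Fraction of physical memory consumed by page tables."""
    return fields["PageTables"] / fields["MemTotal"]
```

Fed the values from the slide (250 GB total, 112 GB of page tables), the share comes out at roughly 45% of RAM, which is memory spent on bookkeeping rather than on the SGA or on connections.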
  • 11. Performance with Hugepages
      * Without Hugepages (200 Connections to a 12.9GB SGA)
          – Before DB Startup PageTables: 7400 kB
          – After DB Startup PageTables: 652900 kB
          – After 200 PQ slaves run query PageTables: 6189248 kB
          – Time to complete: 00:10:23.60
      * With Hugepages (200 Connections to a 12.9GB SGA)
          – Before DB Startup PageTables: 7748 kB
          – After DB Startup PageTables: 21288 kB
          – After 200 PQ slaves run query PageTables: 80564 kB
          – Time to complete: 00:00:18.77
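The page table growth above can be approximated with simple arithmetic. This is a rough sketch under stated assumptions: 8-byte page table entries (as on x86_64), every process mapping the whole SGA with its own private page tables, and only the lowest-level entries counted. The function name is hypothetical.

```python
PTE_BYTES = 8  # assumed size of one page table entry on x86_64


def page_table_bytes(mapped_bytes, page_size, processes):
    """Estimate leaf page table memory when each process maps the region privately."""
    entries = -(-mapped_bytes // page_size)  # ceiling division: pages needed
    return entries * PTE_BYTES * processes


SGA = int(12.9 * 1024**3)                         # the 12.9GB SGA from the slide
small = page_table_bytes(SGA, 4 * 1024, 200)      # 4k pages, 200 connections
huge = page_table_bytes(SGA, 2 * 1024**2, 200)    # 2MB pages, 200 connections
```

With 4k pages the estimate is about 5 GiB, the same order of magnitude as the 6189248 kB measured on the slide. With 2MB pages it drops to about 10 MiB, a factor of 512 smaller, because each entry covers 512 times as much memory.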
  • 12. Hugepages and Transparent Hugepages: Performance for DB and Middleware Applications
      * “Regular” Hugepages [Ref. Doc ID 749851.1]
          – Reduce footprint of individual Oracle database connections.
          – Increase performance and scalability.
          – Requires manual tuning after SGA changes, and does not work with AMM.
      * Transparent Hugepages
          – Transparent hugepages do not help the RDBMS use case.
          – Auto-allocate hugepages for large memory allocations. Great for Java/middleware/applications.
          – New for UEK and OL6!
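The manual tuning mentioned above comes down to recomputing how many hugepages the shared memory segments need whenever the SGA changes. A minimal sketch, assuming a 2MB hugepage size and that each shared memory segment is rounded up to whole hugepages separately; the function name is hypothetical, and the referenced support note remains the authoritative procedure.

```python
HUGEPAGE_BYTES = 2 * 1024**2  # assumed default hugepage size on x86_64


def hugepages_needed(segment_bytes, hugepage_bytes=HUGEPAGE_BYTES):
    """Number of hugepages required to back the given shared memory segments."""
    return sum(-(-size // hugepage_bytes) for size in segment_bytes)
```

For example, `hugepages_needed([int(12.9 * 1024**3)])` gives the page count for a single 12.9GB segment, which could then be compared against the current vm.nr_hugepages value.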
  • 13. Issue: Slow Performance at 95% RAM (OL5-specific issue with large memory allocation)
      * Symptoms
          – System is swapping and shows low free/cached memory
          – Reduced system performance
      * Cause: Usually the kernel is hogging CPU in try_to_free_pages from pagecache, inactive lists.
      * Solution
          – Ensure you are running a shrink_zone patched kernel: UEK, OL6, or OL5+BUG6086839
          – If system is swapping but performance is OK, get more RAM.
  • 14. Issue: System Swapping with Free Memory (NUMA-specific problem)
      * Symptom: System starts to swap while reporting free RAM
          – vmstat reports free memory.
          – dmesg has “order 5 allocation failed” messages.
          – If allocations below order 5 are failing, there are larger issues
      * Cause: Memory Fragmentation. On NUMA systems, caused by fragmentation of node-local memory for kernel applications.
      * Solutions:
          – Disable NUMA
          – Decrease MTU size if using jumbo frames
  • 15. Memory Accounting: What is Linux Doing With My Memory?
      * Free = Cached + Free
          – All free space on Linux is used for pagecache.
          – This behavior cannot be disabled.
      * Process Shared Memory is hard to find in Linux
          – RSS double counts shared memory, Total includes unmapped pages.
          – Use /proc/<pid>/smaps to see real process memory usage.
      * cgroups: New features in the latest kernels let you restrict RAM
          – Useful to throttle pagecache use by backup processes
  • 16. Swap: What is it good for? Tuning Swap Space
      * Swap is a highly contentious topic on Linux
          – Benefit: Allows “room to grow” for inadequately sized systems.
          – Drawback: Much slower than memory access, often makes problems worse.
      * Recommendation: Use swap, but ensure IO to swap disk is kept close to zero.
      [Figure: vmstat output]
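Swap IO shows up in the si and so columns of vmstat. The sketch below, assuming the default vmstat layout with a header row naming those columns, extracts them from captured output so a script can check that they stay near zero; the function name is hypothetical.

```python
def swap_activity(vmstat_text):
    """Return (si, so) pairs, one per sample row of captured vmstat output."""
    columns = None
    samples = []
    for line in vmstat_text.splitlines():
        tokens = line.split()
        if "si" in tokens and "so" in tokens:
            columns = tokens  # header row naming each column
            continue
        # Sample rows are all-numeric and have one value per header column.
        if columns and len(tokens) == len(columns) and all(t.isdigit() for t in tokens):
            row = dict(zip(columns, map(int, tokens)))
            samples.append((row["si"], row["so"]))
    return samples
```

Any sample where either value is consistently non-zero means the system is paging to the swap device, which is the condition the recommendation above says to avoid.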
  • 17. Tips and Tricks: General Recommendations
  • 18. Other Configuration Trouble: Assorted Common Configuration Issues
      * Use UUID to Mount System Disks
          – Symptom: System panics after upgrade
          – Cause: New hardware, drivers, or kernel reorders device discovery
          – Cautions: May not work with LVM snapshot
      * NFS Locks Not Released on Reboot
          – Cause: kernel and DNS have different hostnames
          – Solution: ensure kernel hostname is fully qualified. See BUG 3156942.
      * Cluster Reboots with OCFS2
          – Cause: Network or Disk outages can cause OCFS2 to fence nodes
          – Solution: Ensure OCFS2 timeouts are greater than storage/network failover timeouts. Defaults may be too short for o2cb heartbeat.
  • 19. Performance Tuning Kernel Parameters
      * System Scheduling
          – vm.swappiness
              100: Force aggressive swapping
              0: Insurance against a backup process hogging all system memory
      * Network Protocol Buffers
          – net.core.wmem_default/max: Buffer size for outgoing network packets.
          – net.core.rmem_default/max: If these values are set too small, system may discard TCP packets
      * Memory Management
          – vm.dirty_ratio: encourage frequent pagecache writeback
          – vm.lowmem_reserve_ratio/vm.min_free_kbytes: reserve physical memory for kernel allocations
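Each of the parameters above is exposed as a file under /proc/sys, with the dots in the name replaced by directory separators. A minimal sketch for auditing current values, assuming that standard layout; the function names are hypothetical and nothing here changes a setting.

```python
def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name such as vm.swappiness to its /proc/sys path."""
    return root + "/" + name.replace(".", "/")


def read_sysctl(name, root="/proc/sys"):
    """Return the current value of a sysctl as a stripped string."""
    with open(sysctl_path(name, root)) as handle:
        return handle.read().strip()
```

On a Linux host, `read_sysctl("vm.swappiness")` returns the live value, which can be compared against what the preinstall package or your own baseline expects.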
  • 20. Bugs Fixed in Enterprise Linux: Oracle Linux 5 Bug Fixes
      * Oracle finds and fixes critical bugs in Enterprise Linux.
          – Red Hat Compatible Kernel vs. Oracle-Modified kernel
          – Install the Compatible Kernel for bug-for-bug compatibility with RHEL
      * Patches required for correct Oracle product operation
  • 21. Oracle Patches Linux: Specially Tuned Linux Kernels for Customer Requirements
      * Staying up-to-date with your Linux distribution is very important.
      * Bug-Fixed Oracle Linux Kernel
      * UEK: Unbreakable Enterprise Kernel
          – Top Performing Kernel. World Record TPCC Benchmark.
          – Provides OL6 performance on OL5 systems.
      * Backporting of fixes is a temporary solution, not a permanent one.
          – Always plan to update or ksplice to the latest kernel version.
  • 22. UEK: Modern Linux for Oracle (Fast, Modern, Reliable)
      * Get the latest in performance and features from Linux, tested by Oracle.
          – All new kernel, optimized for Oracle.
      * Stay closer to mainline Linux with patches to improve performance for Oracle workloads.
          – All patches are open source and submitted to mainline Linux
          – Patches provided via RPM and via ksplice
          – World Record TPCC Benchmark, March '12.
  • 23. Tips and Tricks: Diagnostics
  • 24. Issue: Post-Event Diagnostics. What to do after a crash or hang?
      * Hard hangs
          – Panic, OOPS, nmi_watchdog
          – “Spontaneous Reboot”
      * Brownouts
          – Performance Degradation
      * Cluster Scenarios
          – Network or Disk may have gone away, triggering the fence
          – Need to maintain crash data in the event of loss of net/disk
          – Ensure timeouts (like OCFS2) are set correctly
  • 25. Two Kinds of Critical OS Logging
      * Continuous OS Logging
          – oswatcher continuous logging collects timestamped snapshots of system commands:
              ps, top
              slabinfo, meminfo
              vmstat, mpstat, iostat
          – Other tools can be employed as well, like sar or collectl.
      * Panic/Hang Event Logging
          – serial console or netconsole should always be set up for any production system. No exceptions.
          – Consoles also preserve sysrq data.
          – kdump system memory image collection.
  • 26. Console Logs: Finding Faults if Disk or Networking Fail
      * Kernel Messages may not be available after a crash
          – Serial Consoles are proven technology for preserving console output
      * How to capture Serial output:
          – Reliable ways to capture serial output:
              ILOM virtual console
              Serial-Over-LAN BIOS config
              Inexpensive DB9-USB converter or Serial Concentrator
          – Unreliable ways to capture serial output:
              Physically attached terminal with 'setterm -blank 0' and system not configured to reboot
              netconsole (can be difficult to configure, and subject to network outages)
      * Things to check:
          – Ensure Baud Rate is high enough (not 9600 baud)
          – For Virtual Console, ensure console history is set up to capture a large amount of output
  • 27. Keyboard and Console Diagnostics: SysRq Key Combinations for Diagnostics
      * Magic SysRq Key
          – M: dump system memory statistics
          – P, W: dump the stack for all processors
          – T: dump the kernel stack trace for all processes
          – C: Immediately cause a system crash
          – S … U … B: Emergency Sync all disks, Unmount disks, reboot.
          – Some of these operations (like stack trace) dump a lot of data (1MB or more!).
      * How to Invoke Magic SysRq…
          – Console: Alt + SysRq + <cmd>
          – Serial Console: <Break> <cmd>
          – Command Line: echo t > /proc/sysrq-trigger
          – Oracle VM dom0: xm sysrq <domain ID> <cmd>
          – Ensure kernel.sysrq = 1
          – These operations take full priority in the kernel. Do not run them in your monitoring scripts; use them carefully!
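Because some SysRq commands crash or reboot the machine, any script that wraps /proc/sysrq-trigger benefits from a guard. The sketch below is illustrative only: the split into diagnostic and destructive letters follows the slide, the function names are hypothetical, and writing to the trigger file requires root.

```python
DIAGNOSTIC = set("mpwt")   # memory stats and stack dumps, per the slide
DESTRUCTIVE = set("csub")  # crash, sync, unmount, reboot


def is_safe_sysrq(cmd):
    """True only for single-letter commands that dump state without altering it."""
    return len(cmd) == 1 and cmd.lower() in DIAGNOSTIC


def trigger_sysrq(cmd, path="/proc/sysrq-trigger"):
    """Write a diagnostic SysRq command to the trigger file; refuse anything else."""
    if not is_safe_sysrq(cmd):
        raise ValueError("refusing non-diagnostic SysRq command: %r" % cmd)
    with open(path, "w") as handle:
        handle.write(cmd.lower())
```

Even the diagnostic commands can emit a megabyte or more of output, so a wrapper like this belongs in an on-demand tool rather than in a periodic monitoring job.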
  • 28. Diagnostic/Destructive Kernel Parameters
      * Enable Keyboard- and Console-based Debugging
          – kernel.sysrq=1: Always have this set to enable debug commands
      * System-wide Events
          – panic_on_oom: Panic for Out of Memory condition (alternative would be to kill the high memory process)
          – panic_on_oops: Panic for system problems (off: some modules may survive a panic, but system state is inconsistent)
      * Per-Process Events
          – hung_task_timeout: Enable warning if process not scheduled for (timeout) seconds. Can cause a lot of log messages, not usually useful
          – hung_task_panic: Cause a stack trace and system panic if the timeout is hit. Can be useful for debugging. Not good to set by default.
  • 29. Crash Kernel Memory Snapshots: Set Your System to Automatically Dump Core
      * kdump: uses Linux kexec function to save kernel stacks after a panic
          – Only way to get diagnostic data if disk or network are not available.
          – Reboots the system into a protected memory area to save crashed kernel
      * Very common errors:
          – Not testing kdump: Requires specific memory tuning (crashkernel=) and also requires specific HBA or network drivers
          – Have dedicated space for crash dumps. Preferably not in your root partition. Remember, vmcore == physical memory size.
          – Local disk is faster and more reliable than network dumps.
          – Use gzip or `makedumpfile` to compress cores prior to upload
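A quick first check when testing kdump is whether the running kernel was actually booted with a crashkernel= reservation. A minimal sketch, assuming the input is the contents of /proc/cmdline; the function name is hypothetical, and a present parameter does not by itself prove that a dump will succeed.

```python
def crashkernel_setting(cmdline):
    """Return the crashkernel= value from a kernel command line, or None if absent."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None
```

A result of None means no memory was reserved for the capture kernel, so kdump cannot work until the boot configuration is changed and the system rebooted.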
  • 30. Reading a Kernel Core: Using the Crash Utility
      * Crash `bt` backtrace or SysRq-T stack
          – Get debug symbols from oss.oracle.com
      * Red flags:
          – Many processes in D state (IO)
          – Many processes in same kernel routine (contention?)
      * Caution: Stack traces can be 1M or greater. Don't do this frequently.
      [Figure: dmesg output after SysRq-T]
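On a live system, the first red flag above can be checked cheaply without a stack dump by reading the state field of /proc/<pid>/stat. A minimal sketch, assuming the standard stat layout where the state letter follows the parenthesised command name; the function names are hypothetical.

```python
def task_state(stat_line):
    """Extract the one-letter state from a /proc/<pid>/stat line."""
    # The command name may itself contain spaces or parentheses,
    # so split on the last closing parenthesis.
    _, _, rest = stat_line.rpartition(")")
    return rest.split()[0]


def count_uninterruptible(stat_lines):
    """Count tasks in D state (uninterruptible sleep, usually waiting on IO)."""
    return sum(1 for line in stat_lines if task_state(line) == "D")
```

A steadily growing count across successive samples is a hint that it is worth collecting the heavier SysRq-T or crash data described on this slide.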
  • 31. Diagnostic Tools for Brownouts: Getting more out of your diagnostic tools
      * strace -ttT: Diagnose slow processes
          – Automatically timestamp system calls
          – Useful for diagnosing specific process syscall latency
          – Also helpful to determine if a problem is in kernel or usermode
      * Crash utility on Virtual machines
          – `xm dump-core` takes a noninvasive kernel snapshot of a system
          – Provides memory, stack traces, and kernel logs
  • 32. Tips and Tricks: Ksplice
  • 33. Issue: Old Kernels. Newer Kernel Releases Fix Your Bugs
      * Symptom: Customer systems are encountering known/fixed issues
          – Examples: tcp window_size, shrink_zone, etc.
          – Use new kernels for new features: NFSv4, dtrace, btrfs.
      * Cause: Older kernels are not 'stable'. New kernels fix bugs.
      * Solutions:
          – Implement a periodic update schedule for kernel and OS packages, or…
          – Use ksplice to stay up to date with patches
  • 34. Ksplice: Rebootless Kernel Patching. Zero Downtime Patching for Bugs and Security Updates
      * Ksplice keeps your system up to date
          – Integrated with ULN
          – Now available in online and offline modes
      * Using Ksplice for Diagnostics and Patching
          – Real-World NFS Example
  • 35. Summary
      * Key Linux Tips and Tricks
          – kexec and oswatcher
      * Common Issues
          – memory management and configuration
      * Diagnostic Tools and Use Cases
      * Ksplice in the Datacenter
  • 36. ORACLE LINUX PAVILION: Visit our partners and don't miss these events sponsored by QLogic
      * Smoothie Bar on Monday, Oct 1st, 2:30-5:30pm
      * Ice Cream Social on Wednesday, Oct 3rd, 1-2pm
  • 37. Oracle Linux Sessions: Tuesday, Oct 2nd
      * Oracle Linux TRACK SESSIONS
          – GEN8726, 10:15 AM, Moscone South - 103: General Session: Oracle Linux Strategy and Roadmap. Speakers: Wim Coekaerts and Monica Kumar, Oracle
          – CON8731, 11:45 AM, Moscone South - 270: Top Technical Tips for Automatic and Secure Oracle Linux Deployments. Speakers: Lenz Grimmer, Oracle; Martin Breslin, SEI Global; Ed Bailey, Transunion
  • 38. Oracle Linux Sessions: Wednesday, Oct 3rd
      * Oracle Linux TRACK SESSION
          – CON8729, 3:30 PM, Moscone South - 270: Why Switch to Oracle Linux? Speakers: Monica Kumar, Mike Radomski, SUNY
      * HANDS ON LABS
          – HOL9383, 10:15 AM, Marriott Salon 14/15 YB level: Oracle Linux Package Management: Configuring and Enabling Services
          – HOL9384, 11:45 AM, Marriott Salon 14/15 YB level: Oracle Linux Storage Management with LVM and Device-Mapper
  • 39. Oracle Linux Sessions: Thursday, Oct 4th
      * HANDS ON LABS
          – HOL9383, 12:45 PM, Marriott Salon 14/15 YB level: Oracle Linux Package Management: Configuring and Enabling Services
          – HOL9384, 2:15 PM, Marriott Salon 14/15 YB level: Oracle Linux Storage Management with LVM and Device-Mapper
  • 40. NEW: Oracle Linux Curriculum Footprint. Oracle Linux Training from Oracle University
      * Unix/Linux Essentials (Instructor-led and Live virtual)
      * Oracle Linux System Administration (Instructor-led and Live virtual)
      * This Oracle Linux System Administration course teaches you all the essential system administration skills and includes key information specific to Oracle Linux: Unbreakable Enterprise Kernel, Ksplice, ULN, and other key features
      * Visit: oracle.com/education/linux
  • 41. Resources
      * Join our communities
          – @ORCL_Linux
          – Facebook.com/OracleLinux
          – Blogs.oracle.com/linux
          – Oracle Linux Experts Group
          – YouTube.com/oraclelinuxchannel
      * Visit Oracle.com/linux
      * Download for FREE: edelivery.oracle.com/linux
  • 43. The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Editor's Notes

  • #32: Thanks for attending. For additional information and resources, you can visit us at oracle.com/linux and join our social media sites to get day-to-day updates.