How to monitor NVMe drives in OSX

NVMe support in OSX

After upgrading to the latest MacBook Pro I found that smartctl could not find any SMART-capable drive. The reason is that Apple replaced the SATA SSD with an NVMe one, and the old SMART API, which is very ATA-specific, does not work with it. Smartmontools already includes NVMe support for Linux, Windows and FreeBSD, so I decided to try adding it for Darwin as well. It was not as easy as expected: Apple has not published any source code or documentation for NVMe device support or monitoring. Moreover, OSX ships no tool that shows these statistics, and the old tools from the SDK are useless because of the API change.

Starting to search for the API provider

After looking through the file tree I found a good candidate: /System/Library/Extensions/NVMeSMARTLib.plugin. It is built much like /System/Library/Extensions/SMARTLib.plugin/, which provides SATA/ATA SMART support. As I mentioned, there is no documentation, so I had to use otool, nm and lldb to work with it. As expected, the API turned out to be similar to the ATA one. You can list its symbols and functions with this command: nm NVMeSMARTLib | c++filt -p -i. I then tried to connect to it using a modified version of the SMART example from the SDK. The tricky part was finding the kIONVMeSMARTUserClientTypeID and kIONVMeSMARTInterfaceID values, which the CFPlugIn infrastructure in IOKit uses to initialize the API interface. Fortunately the library compares these values at runtime, so I was able to find them in the disassembly. After successfully connecting to the interface I was able to reconstruct the missing headers and use some of the functions (see below).
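
When hunting for such UUIDs in disassembly output, plists or logs, it helps to have them in the usual textual 8-4-4-4-12 form as well as raw bytes. The sketch below is plain portable C and does not touch CoreFoundation or IOKit. The two byte arrays are copied from the reconstructed header later in this post; the helper names uuid_equal and uuid_to_string are made up for this example. The byte-wise comparison only illustrates the kind of check the library does at runtime, it is not taken from Apple's code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { UUID_BYTES = 16, UUID_STRING_SIZE = 37 }; /* 36 characters plus NUL */

/* Bytes behind kIONVMeSMARTUserClientTypeID (from the reconstructed header). */
static const unsigned char kUserClientTypeBytes[UUID_BYTES] = {
    0xAA, 0x0F, 0xA6, 0xF9, 0xC2, 0xD6, 0x45, 0x7F,
    0xB1, 0x0B, 0x59, 0xA1, 0x32, 0x53, 0x29, 0x2F
};

/* Bytes behind kIONVMeSMARTInterfaceID (from the reconstructed header). */
static const unsigned char kInterfaceBytes[UUID_BYTES] = {
    0xCC, 0xD1, 0xDB, 0x19, 0xFD, 0x9A, 0x4D, 0xAF,
    0xBF, 0x95, 0x12, 0x45, 0x4B, 0x23, 0x0A, 0xB6
};

/* Byte-wise comparison of two raw UUIDs (hypothetical helper). */
static int uuid_equal(const unsigned char *a, const unsigned char *b)
{
    return memcmp(a, b, UUID_BYTES) == 0;
}

/* Format a raw UUID as uppercase 8-4-4-4-12 hex (hypothetical helper).
 * out must provide at least UUID_STRING_SIZE bytes. Returns out. */
static const char *uuid_to_string(const unsigned char *u, char *out, size_t out_size)
{
    assert(out_size >= UUID_STRING_SIZE);
    snprintf(out, out_size,
             "%02X%02X%02X%02X-%02X%02X-%02X%02X-%02X%02X-"
             "%02X%02X%02X%02X%02X%02X",
             u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
             u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
    return out;
}
```

Calling uuid_to_string(kInterfaceBytes, buf, sizeof buf) with a 37-byte buffer gives a string that can be pasted straight into a search.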

What is working and what is not.

The most important functions, SMARTReadData and GetIdentifyData, work, and they return results in structures that match the NVMe standard. I was not able to get the GetLogPage function running; it probably expects a pointer to some structure with predefined data. If Apple releases any consumer of it, this should be easy to find out.
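
Since the SMARTReadData result follows the NVMe standard, it can be decoded like the standard 512-byte SMART / Health Information log page. The sketch below works on a raw byte buffer rather than on Apple's API, so it does not need IOKit. The field offsets are the ones I know from the NVMe specification; the helper names are invented for this example, and only the low 64 bits of the 128-bit counters are read, which is an assumption that is fine for realistic values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offsets inside the NVMe SMART / Health Information log page. */
enum {
    NVME_SMART_LOG_SIZE    = 512,
    OFF_CRITICAL_WARNING   = 0,   /* 1 byte */
    OFF_TEMPERATURE        = 1,   /* 2 bytes, Kelvin */
    OFF_AVAIL_SPARE        = 3,   /* 1 byte, percent */
    OFF_SPARE_THRESHOLD    = 4,   /* 1 byte, percent */
    OFF_PERCENT_USED       = 5,   /* 1 byte, percent */
    OFF_DATA_UNITS_READ    = 32,  /* 16 bytes */
    OFF_DATA_UNITS_WRITTEN = 48,  /* 16 bytes */
    OFF_POWER_CYCLES       = 112, /* 16 bytes */
    OFF_POWER_ON_HOURS     = 128, /* 16 bytes */
    OFF_UNSAFE_SHUTDOWNS   = 144  /* 16 bytes */
};

/* Read an unsigned little-endian integer of up to 8 bytes. */
static uint64_t read_le(const uint8_t *p, size_t nbytes)
{
    uint64_t v = 0;
    assert(nbytes <= 8);
    for (size_t i = nbytes; i-- > 0; )
        v = (v << 8) | p[i];
    return v;
}

/* Composite temperature is reported in Kelvin; convert to Celsius. */
static int smart_temperature_celsius(const uint8_t *log)
{
    return (int)read_le(log + OFF_TEMPERATURE, 2) - 273;
}

/* Low 64 bits of one of the 128-bit counters at the given offset. */
static uint64_t smart_counter_low64(const uint8_t *log, size_t offset)
{
    assert(offset + 16 <= NVME_SMART_LOG_SIZE);
    return read_le(log + offset, 8);
}

/* One data unit stands for 1000 blocks of 512 bytes. */
static uint64_t data_units_to_bytes(uint64_t units)
{
    return units * 1000u * 512u;
}
```

As a sanity check against the smartctl output further down: 19,311,330 data units correspond to 9,887,400,960,000 bytes, which matches the 9.88 TB shown for Data Units Read.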

There are also some other, unknown functions in this API: GetFieldCounters (always returns an error), ScheduleBGRefresh (no parameters, returns OK), GetSystemCounters and GetAlgorithmCounters (some driver info? or vendor-specific log pages?). Another interesting finding was the string “Sandisk 401Z128G-4p-MLC” in the SMART log page, so this NVMe drive possibly originates from that vendor.

Smartmontools support

I am working on adding limited NVMe support to smartctl and smartd for OSX. I already have a working prototype, but I need to clean up and refactor some code. I am planning to add this before the next release. Below is the output for my disk:

smartctl 6.6 2017-09-14 r4434M [Darwin 16.7.0 x86_64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===

Model Number:                       APPLE SSD AP0512J
Serial Number:                      XXXXXX
Firmware Version:                   16.14.01
PCI Vendor/Subsystem ID:            0x106b
IEEE OUI Identifier:                0x000502
Controller ID:                      0
Number of Namespaces:               2
Local Time is:                      Wed Sep 20 08:56:36 2017 CEST
Firmware Updates (0x02):            1 Slot
Optional Admin Commands (0x0004):   Frmw_DL
Optional NVM Commands (0x0004):     DS_Mngmt
Maximum Data Transfer Size:         256 Pages

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     0.00W       -        -    0  0  0  0        0       0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02, NSID 0x0)
Critical Warning:                   0x00
Temperature:                        33 Celsius
Available Spare:                    90%
Available Spare Threshold:          2%
Percentage Used:                    0%
Data Units Read:                    19,311,330 [9.88 TB]
Data Units Written:                 11,653,167 [5.96 TB]
Host Read Commands:                 50,388,833
Host Write Commands:                37,404,327
Controller Busy Time:               0
Power Cycles:                       2,320
Power On Hours:                     23
Unsafe Shutdowns:                   7
Media and Data Integrity Errors:    0
Error Information Log Entries:      0

Reconstructed API

If you want to play with the API yourself, you can use this header. Please let me know if you find out how to use GetLogPage, or have any other useful information:

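// NVMe definitions, undocumented, experimental

#include <CoreFoundation/CoreFoundation.h>
#include <CoreFoundation/CFPlugInCOM.h>   // IUNKNOWN_C_GUTS
#include <IOKit/IOReturn.h>               // IOReturn

// struct nvme_smart_log and struct nvme_id_ctrl are not declared here; the
// including code must provide them, laid out as in the NVMe specification.

// Constant to init driver
#define kIONVMeSMARTUserClientTypeID CFUUIDGetConstantUUIDWithBytes(NULL, \
        0xAA, 0x0F, 0xA6, 0xF9, 0xC2, 0xD6, 0x45, 0x7F,                   \
        0xB1, 0x0B, 0x59, 0xA1, 0x32, 0x53, 0x29, 0x2F)

// Constant to use plugin interface
#define kIONVMeSMARTInterfaceID CFUUIDGetConstantUUIDWithBytes(NULL,      \
        0xCC, 0xD1, 0xDB, 0x19, 0xFD, 0x9A, 0x4D, 0xAF,                   \
        0xBF, 0x95, 0x12, 0x45, 0x4B, 0x23, 0x0A, 0xB6)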

// interface structure, obtained using lldb, could be incomplete or wrong
typedef struct IONVMeSMARTInterface
{
        IUNKNOWN_C_GUTS;

        UInt16 version;
        UInt16 revision;

                // NVMe smart data, returns nvme_smart_log structure
        IOReturn ( *SMARTReadData )( void *  interface,
                                     struct nvme_smart_log * NVMeSMARTData );

                // NVMe IdentifyData, returns nvme_id_ctrl per namespace
        IOReturn ( *GetIdentifyData )( void *  interface,
                                      struct nvme_id_ctrl * NVMeIdentifyControllerStruct,
                                      unsigned int ns );

                // Always returns kIOReturnDeviceError
        IOReturn ( *GetFieldCounters )( void *   interface,
                                        char * FieldCounters );
                // Returns 0
        IOReturn ( *ScheduleBGRefresh )( void *   interface);

                // Always returns kIOReturnDeviceError, probably expects pointer to some
                // structure as an argument
        IOReturn ( *GetLogPage )( void *  interface, void * data, unsigned int, unsigned int);


                /* GetSystemCounters: looks like a table of attributes. Sample result:

                0x101022200: 0x01 0x00 0x08 0x00 0x00 0x00 0x00 0x00
                0x101022208: 0x00 0x00 0x00 0x00 0x02 0x00 0x08 0x00
                0x101022210: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x101022218: 0x03 0x00 0x08 0x00 0xf1 0x74 0x26 0x01
                0x101022220: 0x00 0x00 0x00 0x00 0x04 0x00 0x08 0x00
                0x101022228: 0x0a 0x91 0xb1 0x00 0x00 0x00 0x00 0x00
                0x101022230: 0x05 0x00 0x08 0x00 0x24 0x9f 0xfe 0x02
                0x101022238: 0x00 0x00 0x00 0x00 0x06 0x00 0x08 0x00
                0x101022240: 0x9b 0x42 0x38 0x02 0x00 0x00 0x00 0x00
                0x101022248: 0x07 0x00 0x08 0x00 0xdd 0x08 0x00 0x00
                0x101022250: 0x00 0x00 0x00 0x00 0x08 0x00 0x08 0x00
                0x101022258: 0x07 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x101022260: 0x09 0x00 0x08 0x00 0x00 0x00 0x00 0x00
                0x101022268: 0x00 0x00 0x00 0x00 0x0a 0x00 0x04 0x00
                .........
                0x101022488: 0x74 0x00 0x08 0x00 0x00 0x00 0x00 0x00
                0x101022490: 0x00 0x00 0x00 0x00 0x75 0x00 0x40 0x02
                0x101022498: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                */
        IOReturn ( *GetSystemCounters )( void *  interface, char *, unsigned int *);


                /* GetAlgorithmCounters: returns mostly zeros. Sample result:
                0x102004000: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004008: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004010: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004018: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004020: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004028: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004038: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004040: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004048: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004050: 0x00 0x00 0x00 0x00 0x80 0x00 0x00 0x00
                0x102004058: 0x80 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004060: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004068: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004070: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004078: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004080: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004088: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004090: 0x00 0x01 0x00 0x00 0x00 0x00 0x00 0x00
                0x102004098: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00

                */
        IOReturn ( *GetAlgorithmCounters )( void *  interface, char *, unsigned int *);
} IONVMeSMARTInterface;
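
One possible reading of the GetSystemCounters dump in the header is a sequence of records, each made of a 16-bit little-endian id, a 16-bit little-endian length and then that many bytes of value. This is only a guess from the sample bytes, not a documented format. The sketch below parses the first 112 bytes of the sample under that assumption; counter_record and parse_counters are names invented for this example, and values longer than 8 bytes are skipped rather than decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* First 112 bytes of the GetSystemCounters sample from the header comment. */
static const uint8_t kSystemCountersSample[] = {
    0x01, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x08, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x03, 0x00, 0x08, 0x00, 0xf1, 0x74, 0x26, 0x01,
    0x00, 0x00, 0x00, 0x00, 0x04, 0x00, 0x08, 0x00,
    0x0a, 0x91, 0xb1, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x05, 0x00, 0x08, 0x00, 0x24, 0x9f, 0xfe, 0x02,
    0x00, 0x00, 0x00, 0x00, 0x06, 0x00, 0x08, 0x00,
    0x9b, 0x42, 0x38, 0x02, 0x00, 0x00, 0x00, 0x00,
    0x07, 0x00, 0x08, 0x00, 0xdd, 0x08, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x08, 0x00, 0x08, 0x00,
    0x07, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x09, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x0a, 0x00, 0x04, 0x00
};

/* One decoded record under the assumed id/length/value layout. */
typedef struct {
    uint16_t id;
    uint16_t length; /* value size in bytes */
    uint64_t value;  /* little-endian value; left at 0 when length > 8 */
} counter_record;

/* Parse records until the buffer ends, a record is truncated, or out is full.
 * Returns the number of complete records stored in out. */
static size_t parse_counters(const uint8_t *buf, size_t size,
                             counter_record *out, size_t max_records)
{
    size_t off = 0, n = 0;
    while (n < max_records && size - off >= 4) {
        uint16_t id  = (uint16_t)(buf[off] | (buf[off + 1] << 8));
        uint16_t len = (uint16_t)(buf[off + 2] | (buf[off + 3] << 8));
        if (len > size - off - 4)
            break; /* value runs past the end of the buffer */
        uint64_t value = 0;
        if (len <= 8) {
            for (size_t i = len; i-- > 0; )
                value = (value << 8) | buf[off + 4 + i];
        }
        out[n].id = id;
        out[n].length = len;
        out[n].value = value;
        n++;
        off += 4u + len;
    }
    return n;
}
```

With the sample above this yields nine complete records with ids 1 to 9; the tenth record is cut off where the dump is elided. Record 8 decodes to 7, the same number as Unsafe Shutdowns in the smartctl output, and records 3 to 7 are close to Data Units Read, Data Units Written, Host Read Commands, Host Write Commands and Power Cycles. That may be a coincidence, so treat the mapping as a hint only.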



Smartmontools daily builds

Sometimes I need to audit servers, and often smartmontools there is very old, not installed at all (with broken repositories), or not working for some reason. That is one of the reasons why http://builds.smartmontools.org was created. You can download the latest SVN builds for the following systems:

  • Darwin (OSX) package, Mach-O universal binary with 2 architectures: i386+x86_64
  • Win32 installer (32 and 64 bit)
  • Linux: i686 and x86_64, static and dynamic
  • Source code

The service is currently in “experimental” status; please report any issues with it here or on https://smartmontools.org.
