NBT Decoder/Encoder for PHP


Yep, it’s another binary format decoding project. I just about had my fill with nVault, but decided to give this a whack, since there doesn’t seem to be a working script for decoding the NBT format using PHP.

The Chase

As with the nVault project, if all you care about is the resulting code, it can be found in my Subversion repository for Minecraft, along with my other in-progress code.

July 30th, 2011: I’ve moved this project to a GitHub repository. All future updates will happen there.

Prelude

NBT is the format used by Minecraft to store data such as world chunks, level information, etc. It’s a fairly simple format unto itself, but decoding it in PHP turned out to be much more complicated than I expected.

An NBT file is, to summarize, GZIP-compressed binary data which stores each value in the binary form of its primitive type. It supports a number of different types, as well as three distinct types of list. While not the most elegant format, it gets the job done, and offers a reasonable degree of flexibility.
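To make that concrete, here is a minimal sketch (not the code from my repository) that builds a tiny hand-made NBT payload in memory, compresses it, and then reads back the header of the root tag. The payload and the variable names are made up for illustration; it assumes the zlib extension and PHP 5.4 or later for gzdecode.

```php
<?php
// Hand-made test data, not a real Minecraft file:
// an empty TAG_Compound named "hello", closed straight away by TAG_End.
$raw = "\x0A"            // tag type 10 = TAG_Compound
     . "\x00\x05hello"   // name length as a big-endian unsigned short, then the name
     . "\x00";           // tag type 0 = TAG_End
$compressed = gzencode($raw);

// Decoding: gunzip first, then read the one-byte type and two-byte name length.
$data   = gzdecode($compressed);
$header = unpack('Ctype/nnameLength', $data);
$name   = substr($data, 3, $header['nameLength']);

printf("type=%d name=%s\n", $header['type'], $name);
```

Every named tag starts this way: a type byte, a name length, and the name itself, followed by a payload whose layout depends on the type.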

Hurdles

Rather than describing my process step-by-step, I’ll just point out the major hurdles I encountered in this process. Hopefully this information will justify some of the seemingly-inelegant or inefficient aspects of my code, as well as teach you a little something about PHP’s strengths and limitations.

Pack & Unpack

PHP’s standard library includes two related functions called pack and unpack. Two sides of the same coin, one takes in scalar values and packs them into the specified binary format, while the other does the opposite, reading in binary data and interpreting it as a particular type.
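A quick round trip shows the pair in action. This is only an illustration; the value 258 and the key name "value" are arbitrary choices.

```php
<?php
// 'n' is the format code for an unsigned 16-bit integer in big-endian order.
$binary = pack('n', 258);             // two bytes: 0x01 0x02
$values = unpack('nvalue', $binary);  // array('value' => 258)

echo bin2hex($binary), "\n";          // 0102
echo $values['value'], "\n";          // 258
```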

These functions are (as the underlying source code indicates) stolen from Perl, in terms of concept and implementation. However, they aren’t complete. A few of the finer-grained format codes are absent, such as the ones that would let you pack and unpack a signed short with a specific endianness.

To get around this absence of functionality, I had to learn the way that these types were represented. Being a bit spoiled by loosely-typed scalar languages like PHP, I hadn’t had to understand how the data is actually laid out in memory. It took me a while to wrap my head around endianness representation and such.

Endianness

When storing data in memory which is larger than a byte, there are two major types of “endianness” which various processor architectures will use. The first is Big-Endian, which means that the most significant byte is stored first; the opposite is Little-Endian, which stores the most significant byte last.
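The difference is easiest to see by packing the same number both ways. In this small illustration, 'N' is pack’s code for a big-endian unsigned 32-bit integer and 'V' is its little-endian counterpart.

```php
<?php
$number = 0x01020304;

echo bin2hex(pack('N', $number)), "\n"; // big-endian:    01020304
echo bin2hex(pack('V', $number)), "\n"; // little-endian: 04030201
```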

Most systems in use today are Little-Endian, but the NBT format specifically uses Big-Endian for all of its values. This means the byte order has to be specified explicitly for every packing and unpacking operation.

For signed shorts, ints and longs, I decided to largely sidestep this problem by unpacking the values as unsigned shorts, ints and longs, which unpack can handle with a specific endianness, and then calculating the signed value from the unsigned one.
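The calculation is just two’s complement: if the unsigned value has its top bit set, subtract 2 to the power of the bit width. Here is a rough sketch of the idea, with hypothetical helper names (be_short and be_int are not functions from my class). The int version assumes a PHP build whose native integers are wider than 32 bits.

```php
<?php
// Hypothetical helper: big-endian signed 16-bit integer from 2 bytes.
function be_short(string $bytes): int {
    $unsigned = unpack('n', $bytes)[1];   // 0 .. 65535
    return $unsigned >= 0x8000 ? $unsigned - 0x10000 : $unsigned;
}

// Hypothetical helper: big-endian signed 32-bit integer from 4 bytes.
// Assumes native integers wider than 32 bits, so the unsigned value fits.
function be_int(string $bytes): int {
    $unsigned = unpack('N', $bytes)[1];   // 0 .. 4294967295
    return $unsigned >= 0x80000000 ? $unsigned - 0x100000000 : $unsigned;
}

echo be_short("\xFF\xFE"), "\n";          // -2
echo be_int("\xFF\xFF\xFF\xFF"), "\n";    // -1
```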

For floats, I simply detected the system’s endianness, and reversed the byte order if necessary.
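Sketched out, that looks something like the following. The helper name be_float is made up for this example, and it assumes the machine uses 4-byte IEEE 754 floats. The 'S' code packs an unsigned short in machine byte order, which is what makes the detection work.

```php
<?php
// Hypothetical helper: read a big-endian 4-byte float.
function be_float(string $bytes): float {
    // On a little-endian machine, the number 1 packs as 0x01 0x00.
    $littleEndian = pack('S', 1) === "\x01\x00";
    if ($littleEndian) {
        $bytes = strrev($bytes);          // flip to machine byte order
    }
    return unpack('f', $bytes)[1];        // 'f' = float in machine byte order
}

echo be_float("\x3F\x80\x00\x00"), "\n";  // 1
echo be_float("\xC0\x20\x00\x00"), "\n";  // -2.5
```

Doubles work the same way with 8 bytes and the 'd' code.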

Integer Limit

In PHP, the size of an integer depends on the platform: 32-bit builds (and Windows builds, even on 64-bit hardware) are limited to signed 32-bit integers, while 64-bit builds elsewhere get signed 64-bit integers. This issue is expected to be resolved in PHP 6, but that doesn’t help us much now. What this means is that long integers, which are 64-bit, cannot be relied upon to fit into a native PHP integer, and the unsigned intermediate value I unpack first can overflow even where they do.

To get around this, I was forced to use GMP, which allows one to perform arithmetic operations on arbitrary-precision integers, using strings as the means of passing them around. What this means for us is that we can store integers which are larger than 32 bits can represent, and perform math on them.
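As a rough sketch of how that can work (again, not the exact code from my class, and be_long is a made-up name): read the 8 bytes as two unsigned 32-bit halves, glue them together with GMP, and apply the same two’s complement rule as for the smaller types. The result comes back as a decimal string. This requires the GMP extension.

```php
<?php
// Hypothetical helper: big-endian signed 64-bit integer as a decimal string.
function be_long(string $bytes): string {
    $halves = unpack('Nhi/Nlo', $bytes);
    // sprintf('%u') gives the unsigned decimal form, even on builds where
    // a native integer would show the top bit as a minus sign.
    $hi = gmp_init(sprintf('%u', $halves['hi']));
    $lo = gmp_init(sprintf('%u', $halves['lo']));

    // value = hi * 2^32 + lo
    $value = gmp_add(gmp_mul($hi, gmp_pow(2, 32)), $lo);

    // Top bit set means negative: subtract 2^64.
    if (gmp_cmp($value, gmp_pow(2, 63)) >= 0) {
        $value = gmp_sub($value, gmp_pow(2, 64));
    }
    return gmp_strval($value);
}

echo be_long("\xFF\xFF\xFF\xFF\xFF\xFF\xFF\xFF"), "\n"; // -1
echo be_long("\x7F\xFF\xFF\xFF\xFF\xFF\xFF\xFF"), "\n"; // 9223372036854775807
```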

Conclusions

While this all seems very obvious now that I lay it out, there isn’t much documentation to be had on the subject, which I intend to remedy (being on the PHP doc team). It has taken me weeks of frustration, testing and dumb luck to come up with the solution you see here, which is probably why nobody has come out with anything similar yet (to my knowledge).

Right now I’m not going to license the code, and I’ll simply say that it’s public domain. If someone’s project can benefit from the code, have at ‘er. I don’t care to add any restrictions.


  1. #1 by MagicalTux on December 31, 2010 - 2:50 pm

    Hi,
    I also wrote a quick’n’dirty NBT decoder for a project here, which does not behave exactly the same way as yours (first because I only needed to read, not to write).
    My code requires PHP to be running on a 64-bit system (it’s been a while since PHP’s ints have been natively 64 bits on 64-bit systems) because I was too lazy to use any MP library.
    I was too lazy to implement the float/double support so I took yours, however I wonder if it is really portable (anyway I don’t care much about floats either, just nice to have)

    • #2 by Justin Martin on December 31, 2010 - 7:55 pm

      Unless I’m mistaken, PHP cannot handle 64-bit integers even on 64-bit systems. It has to do with the way integers are handled in the PHP core. If it does work, then it’s probably fraught with issues.

      As for the floats and doubles in my code, those are two of the few data types which are handled entirely using native functionality, which will port perfectly across any system.

  2. #3 by Barry Carlyon on March 23, 2011 - 4:33 pm

    Just wanted to say many thanks for developing this.

    I was using a different class, that could just read and could not handle the large ints properly.

    But I am now using your class (since I need to write) and it works perfectly.

    I now have a nice Minecraft teleporter (pick a signpost from the map, and then I can put your player there next time you log on by writing to your player.dat)

    And I can just go php changespawn.php x y z and change my spawn without the clunkiness of opening a map editor, not to mention downloading the map and re-uploading it again.

    So many thanks!

  3. #4 by Meppapza on April 6, 2011 - 6:53 am

    Hi there! Please tell me, how can I use things like this? Do you perhaps have an example script? I am new to using classes and I really want to learn how to use this.

    • #5 by Justin Martin on April 6, 2011 - 8:22 am

      It’s worth mentioning that the NBT format is now only used to store the chunk data itself in the Beta Level Format. When I wrote this post, the NBT format was the top-level format for all Alpha Level Format files.

      I’m in the process of developing a decoder/encoder for the Beta Level format, so check back in a while, and I’ll probably have finished it.

      • #6 by Meppapza on April 6, 2011 - 10:44 am

        I found that the inventory items in Minecraft also use the NBT format… would this be the Beta format as well?

        What I want to do is write a mod for my SMF forum where my members can buy stuff with points they earn by posting.

  4. #7 by GiL_TheB on May 25, 2011 - 1:40 am

    Ah, great stuff you made. I use it to read/write *.schematic files and it works great. I just modified it a bit so it can output a string for a direct download feature: HTML editor > JSON > PHP > schematic file

  5. #8 by sfPlayer on July 18, 2011 - 12:00 am

    Is it allowed to redistribute your NBT class?

    I use it in a small quick&dirty id replacement script to fix updated content mod item ids in region files.

    • #9 by Justin Martin on July 18, 2011 - 5:48 am

      Yes, you may remix and redistribute this work, though I certainly wouldn’t mind attribution ;).

  6. #10 by Nenesse on July 29, 2011 - 5:54 am

    Hi, I have a problem with nbt.class.php

    you could see the error here: http://minecraft.nenesse.net/nbt.class.php

    I’m using EasyPHP and my server is on Windows 7.

    Do you know what is the problem?

    • #11 by Justin Martin on July 29, 2011 - 11:19 pm

      Hello Nenesse. The first couple of issues are a result of my having used calltime pass-by-reference. I’ll actually fix that bit. The second part is a result of your not having the GMP extension installed, which is required for this class to function.

  7. #12 by sfxworks on July 31, 2011 - 4:17 am

    This awesomeness can only be described as awesome.

  8. #13 by sfxworks on July 31, 2011 - 10:09 am

    Deprecated: Call-time pass-by-reference has been deprecated in C:\Server\wamp\www\Sandbox\NBTDecode\nbt.class.php on line 33
    Call Stack
    # Time Memory Function Location
    1 0.0013 670448 {main}( ) ..\test.php:0

    ( ! ) Deprecated: Call-time pass-by-reference has been deprecated in C:\Server\wamp\www\Sandbox\NBTDecode\nbt.class.php on line 113
    Call Stack
    # Time Memory Function Location
    1 0.0013 670448 {main}( ) ..\test.php:0

    • #14 by Justin Martin on August 2, 2011 - 6:15 pm

      Hi there. I’ve just fixed this issue in the github repository for the library, so you can grab a new copy if you would like.

  9. #15 by Stefan on August 8, 2011 - 6:09 am

    This is a downright awesome bit of code. I had to modify it to accept a string rather than loading from a file. Was quite a task because you were reading it right off the disk.
    Anyway; to say Thank you for this awesome-ness here’s a snippet of code that pulls the chunks from a region file (Beta 1.3+ map files) ready to be decoded by your function… Hope it helps.

    http://www.twosphere.com.au/download/regionfile.ph_

    • #16 by Justin Martin on August 8, 2011 - 9:45 am

      Hi Stefan. This looks great. If you were to put together an OOP-style bit of code to handle this, I’d gladly push it to the github repo.

  10. #17 by Artemis on September 3, 2011 - 9:28 am

    Hey guy,
    Is there any way to make your class work without the GMP extension? My web host doesn’t support it… :(
    I just saw a fork of your original project here: http://pastebin.com/fwRwpfNs It decodes the NBT file perfectly, but throws an error when I try to use the writeFile() function.

    Thanks for your help.

  11. #18 by sfxworks on September 6, 2011 - 4:18 pm

    Any way to have it stuff the entire nbt into a variable instead of doing a print_r?

    • #19 by Justin Martin on September 6, 2011 - 5:13 pm

      The test.php script is just showing that the NBT class works on your system. When you NBT::loadFile(), it loads the NBT into $nbt->root. You can then traverse that variable.

      • #20 by sfxworks on September 6, 2011 - 6:44 pm

        So something like http_build_query($nbt->root) would work? Since an nbt is an array within arrays I can send it as an array?

  12. #21 by sfxworks on September 7, 2011 - 7:44 am

    I have no idea…
    I’m so tired. All I need to know is how to read the parts of a variable. I have this schematic file. It’s able to read the thing just fine but I CAN’T for the life of me put it in a variable and read it out to Flash.

    I must be doing it wrong. It does the whole

    Reading tag “WEOffsetX” at offset 13
    Reading in list of 0 tags of type 10.
    Reading tag “Entities” at offset 29
    Reading tag “Materials” at offset 45
    Reading tag “Height” at offset 64
    etc.
    Yet it doesn’t know what I’m saying when I do something like print_r($nbt->root=>"Schematic"); or echo($nbt->root['Height'])

    …I tried your traverse tag function but it’s not letting me use it since it’s a protected function. And even if I made it unprotected I have no idea what $fp is.

  13. #22 by sfxworks on September 7, 2011 - 11:47 am

    herp derp $nbt->root[0]['value'][everything else]
    Sorry to waste your time :)

  14. #23 by Darker on August 22, 2012 - 11:33 pm

    Thanks for this stuff, it’s great and awesomely easy to use!
    I’ve added a key search function to find the paths to all occurrences of a key with the specified name:
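    [code]public function findKeyPath($key, $tmp = NULL, &$path = array(), &$keys = array()) {
        // Strict comparison, so that an empty list is not mistaken for "start at the root".
        if ($tmp === NULL)
            $tmp = $this->root;
        // The loop header and the start of this branch were eaten by the comment form;
        // they are reconstructed here from the surviving fragments.
        for ($i = 0; $i < count($tmp); $i++) {
            if (is_array($tmp[$i]["value"])) {
                $path[] = $i;
                $this->findKeyPath($key, $tmp[$i]["value"], $path, $keys);
                array_pop($path); // step back out of the nested tag
            } else if ($tmp[$i]["name"] == $key) {
                $keys[] = array_merge($path, (array)$i);
            }
        }
        return $keys;
    }[/code]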
    I use it this way:
    $nbt->loadFile("$home\\savename\\level.dat");
    $keyp = $nbt->findKeyPath("LevelName");
    echo "\$nbt->root[".implode("][\"value\"][",$keyp[0])."][\"value\"]";
    Then I copy paste the output and I use it hardcoded. For replacing keys dynamically you must write some function that finds element at specified path.
