Tuesday, May 15, 2018

JSON Parsing with gron

While jq is powerful, its major shortcoming is that it requires one to know the structure of the JSON being parsed.  gron is less restrictive and combines easily with Linux tools such as grep, sed, and awk to build very powerful parsing pipelines, without having to know exactly where to expect a particular structure or value.

Using the Right Tools

As a polyglot programmer, I strive to employ the simplest approach and the best tool for the job.  I have parsed JSON in Java, Python, and Go, but too many programmers ignore the UNIX/Linux tools (sed, awk, cut, etc.) and write hulking data parsers that are just overkill.  With gron, I find it easier to use these strong text editing, manipulation, and filtering tools.

Installing gron

Instructions for installing gron can be found in the project's README on GitHub.  I used brew install gron.  Then, for reasons that will be apparent later, I added the following alias:
alias norg="gron --ungron"

Make JSON greppable

Obviously, being text-based, JSON is already "greppable".  However, the strength of gron comes from its ability to split JSON into lines of what are referred to as "discrete assignments".  Given the JSON snippet below (from an aws ec2 CLI call):

{
    "Reservations": [
        {
            "OwnerId": "<OWNER_ID>",
            "ReservationId": "<RES_ID>",
            "Groups": [],
            "Instances": [
                {
                    "Monitoring": {
                        "State": "disabled"
                    },
                    "PublicDnsName": "",
                    "State": {
                        "Code": 16,
                        "Name": "running"
                    },
                    "EbsOptimized": false,
                    "LaunchTime": "2016-08-31T22:39:37.000Z",
                    "PublicIpAddress": "<PUBLIC_IP>",
                    "PrivateIpAddress": "<PRIVATE_IP>",
                    "ProductCodes": [],
                    "VpcId": "<VPC_ID>",
                    "StateTransitionReason": "",
                    "InstanceId": "<ID>",
                    "ImageId": "<AMI_ID>",
                    "PrivateDnsName": "<PRIVATE_DNS_NAME>",
                    "KeyName": "<KEY_NAME>",
                    "SecurityGroups": [...


Running cat ~/ec2.json | gron parses the JSON and converts it to lines of discrete assignments:
json = {};
json.Reservations = [];
json.Reservations[0] = {};
json.Reservations[0].Groups = [];
json.Reservations[0].Instances = [];
json.Reservations[0].Instances[0] = {};
json.Reservations[0].Instances[0].AmiLaunchIndex = 0;
json.Reservations[0].Instances[0].Architecture = "x86_64";
json.Reservations[0].Instances[0].BlockDeviceMappings = [];
json.Reservations[0].Instances[0].BlockDeviceMappings[0] = {};
json.Reservations[0].Instances[0].BlockDeviceMappings[0].DeviceName = "/dev/xvda";
json.Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs = {};
json.Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs.AttachTime = "2016-08-21T22:00:41.000Z";
json.Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs.DeleteOnTermination = true;
json.Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs.Status = "attached";
json.Reservations[0].Instances[0].BlockDeviceMappings[0].Ebs.VolumeId = "<VOL_ID>";
json.Reservations[0].Instances[0].ClientToken = "<CLIENT_TOKEN>";
json.Reservations[0].Instances[0].EbsOptimized = false;
json.Reservations[0].Instances[0].Hypervisor = "xen";
json.Reservations[0].Instances[0].ImageId = "<AMI_ID>";
json.Reservations[0].Instances[0].InstanceId = "<ID>";
json.Reservations[0].Instances[0].InstanceType = "t2.small";
json.Reservations[0].Instances[0].KeyName = "<KEY_NAME>";
json.Reservations[0].Instances[0].LaunchTime = "2016-08-31T22:39:37.000Z";
json.Reservations[0].Instances[0].Monitoring = {};
json.Reservations[0].Instances[0].Monitoring.State = "disabled";
json.Reservations[0].Instances[0].NetworkInterfaces = [];
json.Reservations[0].Instances[0].NetworkInterfaces[0] = {};
json.Reservations[0].Instances[0].NetworkInterfaces[0].Association = {};
json.Reservations[0].Instances[0].NetworkInterfaces[0].Association.IpOwnerId = "<OWNER_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Association.PublicDnsName = "";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Association.PublicIp = "<PUBLIC_IP>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Attachment = {};
json.Reservations[0].Instances[0].NetworkInterfaces[0].Attachment.AttachTime = "2016-08-21T22:00:40.000Z";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Attachment.AttachmentId = "<ENI_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Attachment.DeleteOnTermination = true;
json.Reservations[0].Instances[0].NetworkInterfaces[0].Attachment.DeviceIndex = 0;
json.Reservations[0].Instances[0].NetworkInterfaces[0].Attachment.Status = "attached";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Description = "Primary network interface";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Groups = [];
json.Reservations[0].Instances[0].NetworkInterfaces[0].Groups[0] = {};
json.Reservations[0].Instances[0].NetworkInterfaces[0].Groups[0].GroupId = "<SG_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].Groups[0].GroupName = "Bastion";
json.Reservations[0].Instances[0].NetworkInterfaces[0].MacAddress = "<MAC_ADDRESS>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].NetworkInterfaceId = "<ENI_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].OwnerId = "<OWNER_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddress = "<PRIVATE_IP>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses = [];
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0] = {};
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0].Association = {};
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0].Association.IpOwnerId = "<OWNER_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0].Association.PublicDnsName = "";
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0].Association.PublicIp = "<PUBLIC_IP>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0].Primary = true;
json.Reservations[0].Instances[0].NetworkInterfaces[0].PrivateIpAddresses[0].PrivateIpAddress = "<PRIVATE_IP>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].SourceDestCheck = true;
json.Reservations[0].Instances[0].NetworkInterfaces[0].Status = "in-use";
json.Reservations[0].Instances[0].NetworkInterfaces[0].SubnetId = "<SUBNET_ID>";
json.Reservations[0].Instances[0].NetworkInterfaces[0].VpcId = "<VPC_ID>";
json.Reservations[0].Instances[0].Placement = {};
json.Reservations[0].Instances[0].Placement.AvailabilityZone = "us-east-1a";
json.Reservations[0].Instances[0].Placement.GroupName = "";
json.Reservations[0].Instances[0].Placement.Tenancy = "default";
json.Reservations[0].Instances[0].PrivateDnsName = "<DNS_NAME>";
json.Reservations[0].Instances[0].PrivateIpAddress = "<PRIVATE_IP>";
json.Reservations[0].Instances[0].ProductCodes = [];
json.Reservations[0].Instances[0].PublicDnsName = "";
json.Reservations[0].Instances[0].PublicIpAddress = "<PUBLIC_IP>";
json.Reservations[0].Instances[0].RootDeviceName = "/dev/xvda";
json.Reservations[0].Instances[0].RootDeviceType = "ebs";
json.Reservations[0].Instances[0].SecurityGroups = [];...


Munging gron Output Through Command Line Pipelining

JSON is more compact than the gron output, and is better suited to structuring data for transport and integration.  While more verbose, the gron output is a more usable format for searching, filtering, and manipulating with Linux text tools such as grep, cut, sed, and awk.  For example, consider the following commands:


$ cat ~/ec2.json | gron | grep AvailabilityZone
json.Reservations[0].Instances[0].Placement.AvailabilityZone = "us-east-1a";
The above command "pipeline" searches the gronned JSON for the text "AvailabilityZone" and returns the matching discrete assignment line.

$ cat ~/ec2.json | gron | grep AvailabilityZone | cut -d\" -f2
us-east-1a
The above pipeline extracts the AvailabilityZone value via the Linux cut command.
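The cut approach relies on the value being a quoted string and on the search text appearing only in the key.  As a minimal sketch of an alternative, the hypothetical helper below (gron_value is my own name, not part of gron) uses awk to match only the last segment of the path and also handles unquoted values such as booleans and numbers.  It assumes plain dotted keys and leaves any escape sequences inside strings untouched.

```shell
# gron_value KEY: read gron-style assignments on stdin and print the value of
# every assignment whose path ends in ".KEY" (hypothetical helper).
gron_value() {
  awk -v key="$1" -F' = ' '
    {
      n = split($1, parts, /\./)           # path segments left of " = "
      if (parts[n] != key) { next }        # keep only the requested leaf key
      value = substr($0, length($1) + 4)   # everything after " = "
      sub(/;$/, "", value)                 # drop the trailing semicolon
      if (value ~ /^".*"$/) {              # unwrap quoted strings
        value = substr(value, 2, length(value) - 2)
      }
      print value
    }'
}

# Sample lines in the format gron emits, so the sketch runs without gron.
gron_value AvailabilityZone <<'EOF'
json.Reservations[0].Instances[0].EbsOptimized = false;
json.Reservations[0].Instances[0].Placement.AvailabilityZone = "us-east-1a";
EOF
# prints: us-east-1a
```

With gron installed, the equivalent call would be gron ~/ec2.json | gron_value AvailabilityZone.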

$ cat ~/ec2s.json | gron | grep InstanceId | cut -d\" -f2
...
<ID_1>
<ID_2>
<ID_3>
...
The above pipeline pulls all the EC2 instance IDs from the aws ec2 cli output, and creates a list of IDs.
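Once values are reduced to one per line, the usual counting tools apply as well.  The sketch below is an illustration rather than part of the original pipelines: tally is a hypothetical helper name, and it assumes the values contain no spaces (true for instance types and IDs).

```shell
# tally: count how often each input line occurs, printed as "value count".
# Assumes values contain no whitespace.
tally() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Sample values standing in for extracted InstanceType fields.
printf '%s\n' t2.small t2.micro t2.small | tally
# prints:
# t2.micro 1
# t2.small 2
```

With gron installed, this could be fed by something like gron ~/ec2s.json | grep InstanceType | cut -d\" -f2 | tally.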

Transforming JSON with gron and ungron (a.k.a. norg)

Earlier, I defined the norg alias, which points to gron --ungron.  With this option, gron transforms discrete assignments back into JSON.  Consider the commands below:
Note:  cat is no longer needed, because gron can read the file directly.

$ gron ~/ec2s.json | grep InstanceId | norg
...
{
      "Instances": [
        {
          "InstanceId": "<ID>"
        }
      ]
    },
    {
      "Instances": [
        {
          "InstanceId": "<ID>"
        }
      ]
    },
...
The above pipeline grons the JSON, greps for the InstanceId field, and then converts the lines of discrete assignments (json.Reservations[999].Instances[0].InstanceId = "<ID>";) from the grepped gron output back into usable and simplified JSON.

$ gron ~/ec2s.json | egrep InstanceId\|ImageId | norg
...
    {
      "Instances": [
        {
          "ImageId": "<AMI_ID>",
          "InstanceId": "<ID>"
        }
      ]
    },
    {
      "Instances": [
        {
          "ImageId": "<AMI_ID>",
          "InstanceId": "<ID>"
        }
      ]
    },
...
The above pipeline adds ImageId to the transformed JSON using egrep.  (Yes, I know GNU has deprecated egrep in favor of grep -E.)
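For readers who prefer to avoid egrep, the same alternation can be written with grep -E.  This is a minimal sketch: keep_fields is a hypothetical wrapper name, and the input is a few sample lines in gron's output format so that it runs without gron.

```shell
# keep_fields: keep only assignments mentioning InstanceId or ImageId,
# using grep -E in place of egrep.
keep_fields() {
  grep -E 'InstanceId|ImageId'
}

keep_fields <<'EOF'
json.Reservations[0].Instances[0].ImageId = "<AMI_ID>";
json.Reservations[0].Instances[0].InstanceId = "<ID>";
json.Reservations[0].Instances[0].InstanceType = "t2.small";
EOF
# prints the ImageId and InstanceId lines only
```

Quoting the pattern also removes the need to escape the | character for the shell.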

sed

sed is a powerful stream editor, and is handy for executing find-and-replace operations on text.
$ gron ~/ec2s.json | egrep InstanceId\|ImageId\|InstanceType | sed -e 's/Instances/node/g;s/ImageId/ami/g;s/InstanceType/type/g;s/InstanceId/id/g' | norg
...
{
      "node": [
        {
          "ami": "<AMI_ID>",
          "id": "<ID>",
          "type": "t2.small"
        }
      ]
    },
    {
      "node": [
        {
          "ami": "<AMI_ID>",
          "id": "<ID>",
          "type": "t2.micro"
        }
      ]
    },
...
The above pipeline adds stream editing with sed to perform multiple inline string replacements.
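One thing to keep in mind is that these sed substitutions apply to the whole line, so a value that happens to contain "Instances" or "InstanceId" would be rewritten too.  As a sketch of one way around that, the hypothetical rename_keys helper below uses awk to apply the same renames only to the path on the left of " = ".  It assumes that no key contains the text " = ".

```shell
# rename_keys: apply the renames only to the path (field 1), leaving the
# value on the right-hand side of " = " untouched.
rename_keys() {
  awk -F' = ' 'BEGIN { OFS = " = " }
    {
      gsub(/Instances/, "node", $1)
      gsub(/ImageId/, "ami", $1)
      gsub(/InstanceType/, "type", $1)
      gsub(/InstanceId/, "id", $1)
      print
    }'
}

# The value mentions InstanceId, but only the key is renamed.
rename_keys <<'EOF'
json.Reservations[0].Instances[0].InstanceId = "see InstanceId = x";
EOF
# prints: json.Reservations[0].node[0].id = "see InstanceId = x";
```

It would slot into the pipeline in place of the sed call, for example gron ~/ec2s.json | grep -E 'InstanceId|ImageId|InstanceType' | rename_keys | norg.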

$ gron ~/ec2s.json | egrep InstanceId\|ImageId\|InstanceType | sed -e 's/Instances/node/g;s/ImageId/ami/g;s/InstanceType/type/g;s/InstanceId/id/g' | norg | tr -d '\n' | sed "s/ //g"
...
{"node":[{"ami":"<AMI_ID>","id":"<ID>","type":"t2.small"}]},{"node":[{"ami":"<AMI_ID>","id":"<ID>","type":"t2.micro"}]},
...
The above pipeline adds the translate command, tr, to remove newline characters, and then another sed command to remove the remaining whitespace.  This is handy for minifying JSON files.
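A caveat: sed "s/ //g" deletes every space, including spaces inside string values such as "Primary network interface".  The sketch below shows a more conservative variant that strips only leading indentation and newlines.  The name strip_indent_minify is hypothetical, and the approach assumes the JSON is pretty-printed with space indentation, as in the norg output above.  It leaves the single space after each colon, so the result is slightly less compact.

```shell
# strip_indent_minify: join pretty-printed JSON into one line by removing
# leading indentation and newlines only; spaces inside strings survive.
strip_indent_minify() {
  sed 's/^ *//' | tr -d '\n'
}

strip_indent_minify <<'EOF'
{
  "node": [
    {
      "desc": "Primary network interface"
    }
  ]
}
EOF
# prints: {"node": [{"desc": "Primary network interface"}]}
```

Running the same input through tr -d '\n' | sed "s/ //g" would instead yield "Primarynetworkinterface" as the value.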

Summary

gron converts structured JSON into lines of discrete assignments.  This makes it easier to pipeline text to native tools like grep and sed to perform powerful text manipulation.  Once manipulated, the discrete assignments can be transformed back into JSON via the gron -u|--ungron command.  This makes gron a complement to existing tools like grep and sed, for munging (a.k.a. manipulating) JSON data.
