SKIP TO CONTENT

Processing PostgreSQL JSON and jsonb_array_elements()

Padma Karumuri
July 7, 2023
5 min read

PostgreSQL offers two JSON data types for storing JSON data: json and jsonb. To implement efficient query mechanisms for these data types, the json and jsonb data types accept almost identical sets of values as input.

The major practical difference is one of efficiency. The json data type stores an exact copy of the input text, which processing functions must reparse on each execution, while jsonb data is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonb also supports indexing, which can be a significant advantage.

Because the json data type stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept (the processing functions consider the last value as the operative one). By contrast, jsonb does not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.

In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys.  

JSONB benefits:

  • More efficiency
  • Significantly faster processing
  • Supports indexing (which can be a significant advantage, as we'll see later)
  • Simpler schema designs (replacing entity-attribute-value (EAV) tables with jsonb columns, which can be queried, indexed and joined, allowing for performance improvements up until 1000x)

 

JSONB drawbacks:

  • Slightly slower input (due to added conversion overhead)
  • It may take more disk space than plain json due to a larger table footprint
  • Certain queries (especially aggregate ones) may be slower due to the lack of statistics

The PostgreSQL json_typeof() function returns a text string that is the type of the specified JSON value.

Possible returned values are: object, array, string, number, boolean and null. If you provide a NULL parameter, the json_typeof() function will return NULL. jsonb_array_elements() and json_array_elements():

This works efficiently when you have the data in array of elements. But your data always might not have array. In those instances, your query will fail with the following error message.

An error message reading "cannot extract elements from an object. SQL state: 22023"

This is a practical example how to avoid this issue. Let’s create the table:   


   CREATE TABLE public.person_info
        (
            person_info_id serial,
            person_jsn jsonb
        )
        WITH (
            OIDS = FALSE
        );
    insert into person_info(person_jsn) values (format('{"PERSON_DTLS": { "PERSON_ID": "1","MDL_NAME": "Rene","LAST_NAME": "Holler","FIRST_NAME": "Shannon"},
    "PERSON_MLG_ADR": [
        {
            "ZIP_CD": 210441234,
            "FAX_NUM": 7071112222,      
            "TEL_NUM": 7072223333,
            "CNTRY_CD": "US",      
            "STATE_CD": "CA",
            "CITY_NAME": "EUREKA",
            "EMAIL_ADR": "TEST@YAHOO.COM",
            "ADR_TYPE_DESC": "MAILING ADDRESS",
            "LINE_1_ST_ADR": "123 5TH ST",
            "LINE_2_ST_ADR": "SUITE A"
        },
        {
            "ZIP_CD": 210753421,
            "FAX_NUM": 7071114444,      
            "TEL_NUM": 7071115555,
            "CNTRY_CD": "US",
            "STATE_CD": "CA",
            "CITY_NAME": "EUREKA",
            "EMAIL_ADR": "TEST@gmail.com",
            "ADR_TYPE_DESC": "MAILING ADDRESS",
            "LINE_1_ST_ADR": "140 8TH ST",
            "LINE_2_ST_ADR": "STE E"
            }

        ]

    }')::jsonb);

A data output table displaying example address information for Shannon Holler at two locations

    insert into person_info(person_jsn) values (format('{"PERSON_DTLS": { "PERSON_ID": "2","MDL_NAME": "K","LAST_NAME": "Mark","FIRST_NAME": "Terry"},         "PERSON_MLG_ADR": {
            "ZIP_CD": 371101664,
            "FAX_NUM": 4444444444,
            "TEL_NUM": 9314735387,
            "CNTRY_CD": "US",
            "STATE_CD": "TN",
            "CITY_NAME": "MC MINNVILLE",
            "EMAIL_ADR": "mark.terry@gmail.com",    
            "ADR_TYPE_DESC": "MAILING ADDRESS",
            "LINE_1_ST_ADR": "1100 SMITHVILLE HWY",
            "LINE_2_ST_ADR": "STE 114"
        }

    }')::jsonb);

A data output table displaying example address information for Mark Terry at one location

Now let’s execute the query to extract the array data:

    select person_jsn->'PERSON_DTLS'->>'PERSON_ID' AS person_id,
        person_jsn->'PERSON_DTLS'->>'LAST_NAME' AS last_name,
        person_jsn->'PERSON_DTLS'->>'FIRST_NAME' AS first_name,
        person_jsn->'PERSON_DTLS'->>'MDL_NAME' AS mdl_name,
        mlg_adr->>'LINE_1_ST_ADR' AS line_1_st_addr,
        mlg_adr->>'LINE_2_ST_ADR' AS line_2_st_addr,
        mlg_adr->>'CITY_NAME' AS city_name,
        mlg_adr->>'STATE_CD' AS state_cd,
        mlg_adr->>'ZIP_CD' AS zip_cd
    from public.person_info p,jsonb_array_elements(person_jsn->'PERSON_MLG_ADR') as mlg_adr  

An error message reading "cannot extract elements from an object. SQL state: 22023"

You can avoid this error by checking the jsonb_typeof check.  

    select person_jsn->'PERSON_DTLS'->>'PERSON_ID' AS person_id,
        person_jsn->'PERSON_DTLS'->>'LAST_NAME' AS last_name,
        person_jsn->'PERSON_DTLS'->>'FIRST_NAME' AS first_name,
        person_jsn->'PERSON_DTLS'->>'MDL_NAME' AS mdl_name,
        mlg_adr->>'LINE_1_ST_ADR' AS line_1_st_addr,
        mlg_adr->>'LINE_2_ST_ADR' AS line_2_st_addr,
        mlg_adr->>'CITY_NAME' AS city_name,
        mlg_adr->>'STATE_CD' AS state_cd,
        mlg_adr->>'ZIP_CD' AS zip_cd
    from public.person_info p,jsonb_array_elements(person_jsn->'PERSON_MLG_ADR') as mlg_adr  
    where jsonb_typeof(person_jsn->'PERSON_MLG_ADR') = 'array'  

    "person_id" "last_name" "first_name" "mdl_name" "line_1_st_addr" "line_2_st_addr" "city_name" "state_cd" "zip_cd"

    "1" "Holler" "Shannon" "Rene" "123 5TH ST" "SUITE A" "EUREKA" "CA" "210441234"

    "1" "Holler" "Shannon" "Rene" "140 8TH ST" "STE E" "EUREKA" "CA" "210753421"

To get the other non-array data you use the following query:

    select person_jsn->'PERSON_DTLS'->>'PERSON_ID' AS person_id,
        person_jsn->'PERSON_DTLS'->>'LAST_NAME' AS last_name,
        person_jsn->'PERSON_DTLS'->>'FIRST_NAME' AS first_name,
        person_jsn->'PERSON_DTLS'->>'MDL_NAME' AS mdl_name,
        person_jsn->'PERSON_MLG_ADR'->>'LINE_1_ST_ADR' AS line_1_st_addr,
        person_jsn->'PERSON_MLG_ADR'->>'LINE_2_ST_ADR' AS line_2_st_addr,
        person_jsn->'PERSON_MLG_ADR'->>'CITY_NAME' AS city_name,
        person_jsn->'PERSON_MLG_ADR'->>'STATE_CD' AS state_cd,
        person_jsn->'PERSON_MLG_ADR'->>'ZIP_CD' AS zip_cd
    from public.person_info p
    where jsonb_typeof(person_jsn->'PERSON_MLG_ADR') <> 'array'

    "person_id" "last_name" "first_name" "mdl_name" "line_1_st_addr" "line_2_st_addr" "city_name" "state_cd" "zip_cd" "2" "Mark"     "Terry" "K" "1100 SMITHVILLE HWY" "STE 114" "MC MINNVILLE" "TN" "371101664"

To get the entire data at one time use the following:

    select person_jsn->'PERSON_DTLS'->>'PERSON_ID' AS person_id,
        person_jsn->'PERSON_DTLS'->>'LAST_NAME' AS last_name,
        person_jsn->'PERSON_DTLS'->>'FIRST_NAME' AS first_name,
        person_jsn->'PERSON_DTLS'->>'MDL_NAME' AS mdl_name,
        mlg_adr->>'LINE_1_ST_ADR' AS line_1_st_addr,
        mlg_adr->>'LINE_2_ST_ADR' AS line_2_st_addr,
        mlg_adr->>'CITY_NAME' AS city_name,
        mlg_adr->>'STATE_CD' AS state_cd,
        mlg_adr->>'ZIP_CD' AS zip_cd
    from public.person_info p,jsonb_array_elements(person_jsn->'PERSON_MLG_ADR') as mlg_adr  
    where jsonb_typeof(person_jsn->'PERSON_MLG_ADR') = 'array'
    union all
    select person_jsn->'PERSON_DTLS'->>'PERSON_ID' AS person_id,
        person_jsn->'PERSON_DTLS'->>'LAST_NAME' AS last_name,
        person_jsn->'PERSON_DTLS'->>'FIRST_NAME' AS first_name,
        person_jsn->'PERSON_DTLS'->>'MDL_NAME' AS mdl_name,
        person_jsn->'PERSON_MLG_ADR'->>'LINE_1_ST_ADR' AS line_1_st_addr,
        person_jsn->'PERSON_MLG_ADR'->>'LINE_2_ST_ADR' AS line_2_st_addr,
        person_jsn->'PERSON_MLG_ADR'->>'CITY_NAME' AS city_name,
        person_jsn->'PERSON_MLG_ADR'->>'STATE_CD' AS state_cd,
        person_jsn->'PERSON_MLG_ADR'->>'ZIP_CD' AS zip_cd
    from public.person_info p
    where jsonb_typeof(person_jsn->'PERSON_MLG_ADR') <> 'array'

A data output table showing the combined example address information for Shannon Holler and Mark Terry

Each function in this pair requires that the supplied JSON value is an array. This is very useful when extracting the data from jsonb_array_elements

Sign up for our newsletter to join our impact-driven mission.

Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.