Set Empty Fields to Null in Hive

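Depending on the data you load into Hive/HDFS, some of your fields might be empty. By default, Hive's text SerDe represents NULL as \N, so an empty field in a string column is read as an empty string rather than NULL. Having Hive interpret those empty fields as nulls can be very convenient, and it is easy to do in the table definition using the serialization.null.format table property.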

Here is an example from the Big Datums GitHub repo:
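The original snippet did not survive on this page, so what follows is only a minimal sketch of what such a table definition might look like, not the code from the repo. The table name, columns, delimiter, and HDFS location are all hypothetical; the only piece taken from the post is the serialization.null.format property set to an empty string.

```sql
-- Hypothetical table, columns, and location, for illustration only.
-- The relevant part is the TBLPROPERTIES clause at the end.
CREATE EXTERNAL TABLE IF NOT EXISTS customers (
  customer_id INT,
  first_name  STRING,
  email       STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/example/customers'
-- Treat empty fields in the data files as NULL.
TBLPROPERTIES ('serialization.null.format' = '');

-- With the property set, a row such as "42,Ana," should be
-- matched by this query, because its empty email field reads as NULL.
SELECT customer_id, first_name
FROM customers
WHERE email IS NULL;
```

Without the property, the same query would be expected to skip that row, since the empty email would come back as an empty string and would need a check like email = '' instead.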

2 thoughts on “Set Empty Fields to Null in Hive”

  1. Ganesh

    TBLPROPERTIES('serialization.null.format'='') means the following:

    An empty field in the data files will be treated as NULL when you query the table
    When inserting rows to the table, NULL values will be written to the data files as empty fields
