Why does dropping a table in Hive delete the HDFS file the data was loaded from?

I dropped a table, and afterwards I found that my HDFS file had also been deleted from the location I referenced when loading it into the Hive table.
Note: the table's storage location and the HDFS file location are different.

@satya_prakash_gaurav
What kind of Hive table did you create: internal (managed) or external?
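
One way to check the table type is `DESCRIBE FORMATTED`. This is only a sketch: `my_table` is a placeholder name, not a table from this thread.

```sql
-- my_table is a hypothetical name; substitute your own table.
-- In the output, look for the "Table Type" row:
--   MANAGED_TABLE  means internal (managed)
--   EXTERNAL_TABLE means external
-- The "Location" row shows where the table's data lives in HDFS.
DESCRIBE FORMATTED my_table;
```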

Internal Table: When you load data into a Hive internal table from the Local File System (LFS), the data is copied into Hive and the file in LFS still exists. When you load data from HDFS instead, the file is moved from its HDFS location into the Hive warehouse; Hive moves rather than copies it, which avoids keeping a redundant copy. Dropping the table drops both the metadata and the data in the Hive warehouse.
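
The difference between the two load paths can be sketched as follows. The table name, columns, and file paths are made up for illustration and are not from the original question.

```sql
-- Hypothetical managed (internal) table; names and paths are assumptions.
CREATE TABLE sales_managed (id INT, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- LOCAL: the file is copied from the local file system; /tmp/sales.csv stays in place.
LOAD DATA LOCAL INPATH '/tmp/sales.csv' INTO TABLE sales_managed;

-- No LOCAL: the file is moved within HDFS into the table's location,
-- so /user/demo/input/sales.csv no longer exists at its original path afterwards.
LOAD DATA INPATH '/user/demo/input/sales.csv' INTO TABLE sales_managed;

-- Drops the metadata and the table's data in the warehouse.
-- If HDFS trash is enabled, the data may still be recoverable from the
-- user's .Trash directory for a while (unless PURGE was specified).
DROP TABLE sales_managed;
```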

External Table: Here Hive holds only the table metadata, and the data is read from its own HDFS location. Dropping the table drops only the metadata from Hive; the data in HDFS continues to exist.
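
A minimal sketch of the external-table case, assuming a hypothetical directory `/user/demo/input/sales` that already contains comma-delimited files:

```sql
-- Hypothetical external table; the name, columns, and LOCATION are assumptions.
-- LOCATION must point to a directory, not to a single file.
CREATE EXTERNAL TABLE sales_ext (id INT, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/demo/input/sales';

-- The files are queried in place; nothing is moved or copied.
SELECT COUNT(*) FROM sales_ext;

-- By default only the table definition is removed;
-- the files under /user/demo/input/sales are left untouched.
DROP TABLE sales_ext;
```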

Dear Ravi,
Thanks for the reply. But I want to know why the same kind of thing happens when I use export and import.
I exported one table and stored it in another path, /user/xx. After that I tried to import the data into a new table from that path, but I got SemanticException [Error 10027]: Invalid path in the Hive import.

PS: I am able to load the data using LOAD DATA.

@satya_prakash_gaurav,
Please post your question along with the queries you have tried.

@satya_prakash_gaurav

I do see that you have mentioned that " …where we have loaded the file to a hive table…"

Here is the point: if you used LOAD DATA on an HDFS path, your source file was moved, not copied, into the Hive managed table. So the source file was already gone from its original location after the load, before you dropped the table; it was moved rather than deleted.

Now, if you drop the Hive table, the managed table's data is removed as well.

To avoid this, you can define an external table over the source data and load your managed table from it using INSERT INTO… This way, you never lose your source data.

Hope it helps.
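
The suggestion above could look roughly like this. All table names, columns, and the source directory are hypothetical.

```sql
-- External staging table over the source directory (hypothetical path);
-- the source files stay where they are.
CREATE EXTERNAL TABLE sales_staging (id INT, amount DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/demo/source/sales';

-- Managed table that receives its own copy of the data.
CREATE TABLE sales_final (id INT, amount DOUBLE);

-- INSERT ... SELECT writes new files into the managed table's location
-- instead of moving the source files.
INSERT INTO TABLE sales_final
SELECT id, amount FROM sales_staging;

-- Dropping the managed table removes only its copy;
-- the files under /user/demo/source/sales remain.
DROP TABLE sales_final;
```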

@satya_prakash_gaurav, it looks like this issue is related to your other post:

You can check the language manual:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport
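
For reference, the basic statement forms described on that page look like the sketch below. The table names are placeholders, and this does not diagnose the Error 10027 above, since the actual queries were not posted.

```sql
-- src_table and new_table are hypothetical names; /user/xx is the path from the post.
-- EXPORT writes a _metadata file plus the table's data under the target directory.
EXPORT TABLE src_table TO '/user/xx';

-- IMPORT expects a directory produced by EXPORT (one containing _metadata),
-- not an arbitrary directory of data files.
IMPORT TABLE new_table FROM '/user/xx';
```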