
Custom UDTF function (one row in, multiple rows out)

2022-08-04 13:33:36  Source: Internet

Tags: custom, org, udtf, hive, hadoop, import, apache, one-in, ArrayList


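Case requirements

Split an input string on a given separator and return each piece as its own row (one row in, multiple rows out).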

Java implementation

package udtf;

import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import java.util.ArrayList;
import java.util.List;

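// One-in, many-out UDTF: splits a string on a separator and emits each piece as a row.
public class MyExplode extends GenericUDTF {

    // Declares the output schema: a single string column named "user".
    @Override
    public StructObjectInspector initialize(StructObjectInspector argOIs) throws UDFArgumentException {
        List<String> columnNames = new ArrayList<String>();
        columnNames.add("user");
        List<ObjectInspector> objectInspectors = new ArrayList<ObjectInspector>();
        objectInspectors.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(columnNames, objectInspectors);
    }

    // Called once per input row: args[0] is the string, args[1] is the separator.
    @Override
    public void process(Object[] args) throws HiveException {
        // A NULL argument produces no output rows instead of a NullPointerException.
        if (args[0] == null || args[1] == null) {
            return;
        }
        String str = args[0].toString();
        String split = args[1].toString();
        // String.split interprets the separator as a regular expression.
        String[] strings = str.split(split);
        for (String s : strings) {
            // forward() emits one output row; the list holds that row's column values.
            ArrayList<String> list = new ArrayList<String>();
            list.add(s);
            forward(list);
        }
    }

    // Nothing to release.
    @Override
    public void close() throws HiveException {
    }
}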
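
The loop in process() passes the separator straight to String.split, which treats it as a regular expression. That works for "_", but a separator such as "|" or "." would not split as a literal. The sketch below is a plain-Java illustration only: SplitSketch and explode are hypothetical names that are not part of the original post, and the rows are collected into a list instead of being sent through forward(). It shows one possible way to split on a literal separator with Pattern.quote.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post: mirrors the loop in
// process() but collects the pieces in a list instead of calling forward().
class SplitSketch {

    // Splits value on a literal separator. Pattern.quote stops characters
    // such as "|" or "." from being interpreted as regex syntax.
    static List<String> explode(String value, String separator) {
        List<String> rows = new ArrayList<>();
        for (String piece : value.split(Pattern.quote(separator))) {
            rows.add(piece);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(explode("zs_ls_ww", "_")); // [zs, ls, ww]
        System.out.println(explode("zs|ls|ww", "|")); // [zs, ls, ww]
    }
}
```

Whether to quote the separator is a design choice: quoting makes every separator literal, while the unquoted call in MyExplode lets the caller pass a regular expression on purpose.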

Hive shell

hive (default)> create temporary function myexplode as "udtf.MyExplode" using jar "hdfs://node1:9000/hive_function-1.0-SNAPSHOT.jar";
Added [/tmp/10de4466-6601-49b1-b749-8b5c8c2809b2_resources/hive_function-1.0-SNAPSHOT.jar] to class path
Added resources: [hdfs://node1:9000/hive_function-1.0-SNAPSHOT.jar]
OK
Time taken: 5.442 seconds


hive (default)> create table a(name string);
OK
Time taken: 1.046 seconds


hive (default)> insert into table a values("zs_ls_ww"),("ww_ml_wb");


hive (default)> select myexplode(name, "_") from a;
OK
user
zs
ls
ww
ww
ml
wb
Time taken: 1.138 seconds, Fetched: 6 row(s)
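
The two input rows become six output rows because the function emits one row per piece for every input row. As a rough illustration, the same flattening can be reproduced in plain Java without any Hive classes. ExplodeRowsSketch and explodeAll are hypothetical names introduced here, and the list tableA simply stands in for the rows of table a.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the Hive query above; no Hive classes are used.
class ExplodeRowsSketch {

    // Splits every input row and appends the pieces to one result list,
    // which is what the repeated forward() calls of the UDTF amount to.
    static List<String> explodeAll(List<String> rows, String separatorRegex) {
        List<String> out = new ArrayList<>();
        for (String row : rows) {
            out.addAll(Arrays.asList(row.split(separatorRegex)));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tableA = Arrays.asList("zs_ls_ww", "ww_ml_wb");
        System.out.println(explodeAll(tableA, "_")); // [zs, ls, ww, ww, ml, wb]
    }
}
```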

Source: https://www.cnblogs.com/jsqup/p/16550253.html
