Action operators on a DataFrame (1)

2022-08-30 12:30:51



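import org.apache.spark.SparkConf
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

// Run locally, using all available cores
val conf = new SparkConf().setAppName("action").setMaster("local[*]")
val session = SparkSession.builder().config(conf).getOrCreate()

// Sample data: (name, age) pairs; each name is 23 characters long, so show() truncates it by default
val seq: Seq[(String, Int)] = Seq(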
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 21),
  ("zs123456789123456789123", 22),
  ("zs123456789123456789123", 23),
  ("zs123456789123456789123", 24),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 21),
  ("zs123456789123456789123", 22),
  ("zs123456789123456789123", 23),
  ("zs123456789123456789123", 24),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 29),
  ("zs123456789123456789123", 30),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 29),
  ("zs123456789123456789123", 30)
)
import session.implicits._
val frame: DataFrame = seq.toDF("namea", "ageb")

1. printSchema

def printSchemaOpt(frame: DataFrame): Unit = {
  println("-----------printSchema start-----------")
  // Prints the column names, types and nullability as a tree
  frame.printSchema()
  println("-----------printSchema end-----------")
}
Result:
-----------printSchema start-----------
root
 |-- namea: string (nullable = true)
 |-- ageb: integer (nullable = false)

-----------printSchema end-----------
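
The nullable flags in this output follow from the Scala types in the tuple: Int is a primitive that cannot hold null, while String can. The following is a minimal sketch, assuming a throwaway local session and two illustrative DataFrames (strict and loose are names made up for this example), of how the flag changes when the column type is Option[Int], and of how to read the same information through schema instead of printing it:

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session created just for this sketch; the app name is arbitrary.
val session = SparkSession.builder().appName("schema-sketch").master("local[*]").getOrCreate()
import session.implicits._

// Int cannot be null, so the column is reported as nullable = false.
val strict = Seq(("zs", 20)).toDF("namea", "ageb")
// Option[Int] may be None, so the same column becomes nullable = true.
val loose = Seq(("zs", Option(20)), ("ls", Option.empty[Int])).toDF("namea", "ageb")

strict.printSchema()
loose.printSchema()

// The schema is also available as a StructType, so the flag can be checked in code.
val strictNullable: Boolean = strict.schema("ageb").nullable
val looseNullable: Boolean = loose.schema("ageb").nullable
println(s"strict: $strictNullable, loose: $looseNullable")
```

Reading the flag from schema is handy when a job should fail early on an unexpected nullable column rather than rely on someone inspecting the printed tree.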

2. show

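show(): displays the first 20 rows; strings longer than 20 characters are truncated (truncate defaults to true)
show(n): displays the first n rows, truncating strings longer than 20 characters
show(true): displays the first 20 rows, truncating strings longer than 20 characters (same as show())
show(false): displays the first 20 rows without truncating strings
show(n, true): displays the first n rows, truncating strings longer than 20 characters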

def showOpt(frame: DataFrame): Unit = {
  println("-----------show1 start-----------")
  // First 20 rows, long strings truncated
  frame.show()
  println("-----------show1 end-----------")
  println("-----------show2 start-----------")
  // First 3 rows, long strings truncated
  frame.show(3)
  println("-----------show2 end-----------")
  println("-----------show3 start-----------")
  // First 30 rows, long strings truncated
  frame.show(30, true)
  println("-----------show3 end-----------")
}
Result:
-----------show1 start-----------
+--------------------+----+
|               namea|ageb|
+--------------------+----+
|zs123456789123456...|  20|
|zs123456789123456...|  21|
|zs123456789123456...|  22|
|zs123456789123456...|  23|
|zs123456789123456...|  24|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  21|
|zs123456789123456...|  22|
|zs123456789123456...|  23|
|zs123456789123456...|  24|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
+--------------------+----+
only showing top 20 rows

-----------show1 end-----------
-----------show2 start-----------
+--------------------+----+
|               namea|ageb|
+--------------------+----+
|zs123456789123456...|  20|
|zs123456789123456...|  21|
|zs123456789123456...|  22|
+--------------------+----+
only showing top 3 rows

-----------show2 end-----------
-----------show3 start-----------
+--------------------+----+
|               namea|ageb|
+--------------------+----+
|zs123456789123456...|  20|
|zs123456789123456...|  21|
|zs123456789123456...|  22|
|zs123456789123456...|  23|
|zs123456789123456...|  24|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  21|
|zs123456789123456...|  22|
|zs123456789123456...|  23|
|zs123456789123456...|  24|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
|zs123456789123456...|  29|
|zs123456789123456...|  30|
|zs123456789123456...|  20|
|zs123456789123456...|  20|
+--------------------+----+
only showing top 30 rows

-----------show3 end-----------
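
Every call in this section leaves truncation on, so the 23-character names are cut to 17 characters plus "...". The sketch below covers the untruncated variants. It assumes a throwaway local session and a two-row DataFrame named people, both invented for this example. The last step also assumes that show() prints through Scala's Console, which is how the Spark releases I am familiar with behave, so treat the capture as an illustration rather than a guarantee:

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session created just for this sketch; the app name is arbitrary.
val session = SparkSession.builder().appName("show-sketch").master("local[*]").getOrCreate()
import session.implicits._

val people = Seq(
  ("zs123456789123456789123", 20),
  ("zs123456789123456789123", 21)
).toDF("namea", "ageb")

// truncate = false prints the full 23-character name.
people.show(false)
// First row only, still untruncated.
people.show(1, false)
// The Int overload sets the cut-off width: cells longer than 10 characters are shortened.
people.show(2, 10)

// Capture what show(false) prints so the untruncated value can be inspected in code.
val buffer = new java.io.ByteArrayOutputStream()
Console.withOut(buffer) { people.show(false) }
val printed: String = buffer.toString("UTF-8")
```

For wide tables, show(false) can produce very long lines, so the Int overload is often the more readable choice.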

3. first/head/take/takeAsList

def getDataOpt(frame: DataFrame): Unit = {
  println("-----------first start-----------")
  // first() returns the first row; read column 1 (ageb) as an Int
  val row: Row = frame.first()
  println(row.getAs[Int](1))
  println("-----------first end-----------")
  println("-----------head start-----------")
  // head(n) returns the first n rows as an Array[Row]
  val array: Array[Row] = frame.head(3)
  println(array.mkString("="))
  println("-----------head end-----------")
  println("-----------take start-----------")
  // take(n) returns the same rows as head(n)
  val arr: Array[Row] = frame.take(3)
  println(arr.mkString("="))
  println("-----------take end-----------")
  println("-----------takeAsList start-----------")
  // takeAsList(n) returns the first n rows as a java.util.List[Row]
  val list: java.util.List[Row] = frame.takeAsList(3)
  println(list)
  println("-----------takeAsList end-----------")
}
Result:
-----------first start-----------
20
-----------first end-----------
-----------head start-----------
[zs123456789123456789123,20]=[zs123456789123456789123,21]=[zs123456789123456789123,22]
-----------head end-----------
-----------take start-----------
[zs123456789123456789123,20]=[zs123456789123456789123,21]=[zs123456789123456789123,22]
-----------take end-----------
-----------takeAsList start-----------
[[zs123456789123456789123,20], [zs123456789123456789123,21], [zs123456789123456789123,22]]
-----------takeAsList end-----------
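
Two details are easy to miss with these operators: a Row field can be read by column name as well as by position, and first() throws a NoSuchElementException when the DataFrame is empty. The following is a minimal sketch, assuming a throwaway local session and a small DataFrame named people (the names and values are made up for this example), showing both points along with a safer way to fetch an optional first row:

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Assumption: a local session created just for this sketch; the app name is arbitrary.
val session = SparkSession.builder().appName("take-sketch").master("local[*]").getOrCreate()
import session.implicits._

val people = Seq(("zs", 20), ("ls", 21), ("ww", 22)).toDF("namea", "ageb")

// Fields can be read by column name instead of by position.
val firstRow: Row = people.first()
val name: String = firstRow.getAs[String]("namea")
val age: Int = firstRow.getAs[Int]("ageb")
println(s"$name is $age")

// take(n) and head(n) return the same leading rows as an Array[Row].
val sameRows: Boolean = people.take(2).sameElements(people.head(2))

// An empty DataFrame: first() would throw here,
// while head(1).headOption returns None instead.
val empty = people.filter($"ageb" > 100)
val maybeRow: Option[Row] = empty.head(1).headOption
println(maybeRow)
```

Reading by name keeps the code working if the column order changes, at the cost of a lookup in the row's schema.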

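4. collect/collectAsList: use with caution. These return all the data in the DataFrame by pulling the rows from every partition onto a single node (the driver), which can easily cause an out-of-memory error.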

def collectOpt(frame: DataFrame): Unit = {
  println("-----------collect start-----------")
  // collect() returns every row as an Array[Row] on the driver
  val array: Array[Row] = frame.collect()
  println(array.mkString("="))
  println("-----------collect end-----------")
  println("-----------collectAsList start-----------")
  // collectAsList() returns every row as a java.util.List[Row] on the driver
  val list: java.util.List[Row] = frame.collectAsList()
  println(list)
  println("-----------collectAsList end-----------")
}
Result:
-----------collect start-----------
[zs123456789123456789123,20]=[zs123456789123456789123,21]=[zs123456789123456789123,22]=[zs123456789123456789123,23]=[zs123456789123456789123,24]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,21]=[zs123456789123456789123,22]=[zs123456789123456789123,23]=[zs123456789123456789123,24]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,29]=[zs123456789123456789123,30]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,20]=[zs123456789123456789123,29]=[zs123456789123456789123,30]
-----------collect end-----------
-----------collectAsList start-----------
[[zs123456789123456789123,20], [zs123456789123456789123,21], [zs123456789123456789123,22], [zs123456789123456789123,23], [zs123456789123456789123,24], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,21], [zs123456789123456789123,22], [zs123456789123456789123,23], [zs123456789123456789123,24], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,29], [zs123456789123456789123,30], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,20], [zs123456789123456789123,29], [zs123456789123456789123,30]]
-----------collectAsList end-----------
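
When the full result may not fit in driver memory, two common ways to avoid a bare collect() are to cap the row count with limit(n) first, or to walk the rows with toLocalIterator(), which needs only as much driver memory as the largest partition. The following is a minimal sketch, assuming a throwaway local session and a generated 1000-row DataFrame named people (both invented for this example):

```scala
import org.apache.spark.sql.{Row, SparkSession}

// Assumption: a local session created just for this sketch; the app name is arbitrary.
val session = SparkSession.builder().appName("collect-sketch").master("local[*]").getOrCreate()
import session.implicits._

// Hypothetical data set standing in for something too large to collect safely.
val people = (1 to 1000).map(i => (s"user$i", 20 + i % 10)).toDF("namea", "ageb")

// Option 1: cap the number of rows before collecting.
val sample: Array[Row] = people.limit(5).collect()

// Option 2: stream rows to the driver one partition at a time.
// toLocalIterator() returns a java.util.Iterator[Row].
val it = people.toLocalIterator()
var seen = 0L
while (it.hasNext) {
  val row = it.next() // process `row` here instead of holding every row in memory
  seen += 1
}
println(s"sampled ${sample.length} rows, iterated over $seen rows")
```

Neither option removes the cost of computing the DataFrame; they only bound how much of the result sits on the driver at once.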

Source: https://www.cnblogs.com/jsqup/p/16638826.html
