A simple similarity algorithm in Java, and selecting from a collection the objects closest to and furthest from a given object

2021-01-09 13:04:21

Tags: return, name, object, base, obj, Java, null, closest, id


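I recently had a requirement to filter data before reporting it, in order to limit peak load. We wanted to pick the records that had changed the most from a collection and upload only those, but the number of collections, the collection types, and the element types were all unknown in advance. I searched online for code that did this, found nothing, and ended up writing my own.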

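Basic rules for computing similarity

  • The first argument p1 is the baseline; the similarity of the second argument p2 to p1 is computed
  • Numeric primitive types are all handled through their common parent type Number
  • A result of 1 means identical; 0 means completely different
  • Objects of different types score 0
  • The same memory address or equal content scores 1
  • Numeric rule fun(p1,p2):
    • If the second argument is larger than the first: s=(p1 * 2)/(p1+p2), result in 0-1
    • If the first argument is larger than the second: s=(p2 * 2)/(p1+p2), result in 0-1
  • char values are converted to numbers and use the numeric rule
  • String rule:
    • First compute the length similarity s1=fun(l1,l2) (numeric rule)
    • Then compute s2=fun(l1,c), the similarity between the length l1 of the first string and the count c of characters in the second string that match the first
    • Return s=s1*s2, result in 0-1
  • Booleans score 1 if equal, 0 otherwise
  • For non-collection objects, compute the similarity of every field of the two objects (o1 and o2), sum the results, and divide by the number of fields c: s=sum(fun(o1,o2))/c, result in 0-1
  • When both arguments are collections (first P[p1,p2...pn], second Q[q1,q2...qn]):
    • First compute the size similarity s1=fun(Pl,Ql) (numeric rule)
    • Then compute s2=fun(Pl,c), the similarity between the size Pl of the first collection and the count c of elements in the second collection that match the first
    • Return s=s1*s2, result in 0-1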
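The numeric and string rules above can be sketched in a few lines. This is only an illustration under stated assumptions: the class and method names are hypothetical, two zeros are treated as identical, and c is counted as the number of (i, j) pairs of equal characters, which is one possible reading of "matching characters".

```java
final class SimilarityRulesSketch {

    // Numeric rule: fun(p1, p2) = 2 * min(p1, p2) / (p1 + p2), for non-negative inputs.
    static double numeric(double p1, double p2) {
        double sum = p1 + p2;
        if (sum == 0) {
            return 1.0; // assumption: two zeros count as identical
        }
        return 2 * Math.min(p1, p2) / sum;
    }

    // String rule: s = fun(l1, l2) * fun(l1, c).
    // Assumption: c counts every (i, j) pair of equal characters.
    static double text(String base, String other) {
        int c = 0;
        for (int i = 0; i < base.length(); i++) {
            for (int j = 0; j < other.length(); j++) {
                if (base.charAt(i) == other.charAt(j)) {
                    c++;
                }
            }
        }
        double s1 = numeric(base.length(), other.length());
        double s2 = numeric(base.length(), c);
        return s1 * s2;
    }

    public static void main(String[] args) {
        System.out.println(numeric(9, 10));     // 18/19
        System.out.println(text("abc", "abd")); // fun(3,3) * fun(3,2)
    }
}
```

Because the numeric rule takes the smaller value in the numerator, swapping the two arguments does not change the score; only the string and collection rules depend on which argument is the baseline.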

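Picking the right object from a collection (only Collection types are supported)

  • Most similar element: iterate over the collection argument P[p1,p2...pn], compute the similarity between each of p1..pn and the second argument p, and return the element with the largest similarity
  • Least similar element: iterate over the collection argument P[p1,p2...pn], compute the similarity between each of p1..pn and the second argument p, and return the element with the smallest similarity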
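The selection step can be written independently of the scoring rule. The sketch below is a hypothetical helper, not the article's code: it takes the scoring function as a parameter, assumes the collection holds no null elements, and keeps the first candidate on ties.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.function.ToDoubleBiFunction;

final class ClosestPickSketch {

    // Returns the candidate with the highest (or lowest) score against target.
    static <T> T pick(Collection<T> candidates, T target,
                      ToDoubleBiFunction<T, T> score, boolean mostSimilar) {
        T best = null;
        double bestScore = 0;
        for (T candidate : candidates) {
            double s = score.applyAsDouble(target, candidate);
            boolean better = mostSimilar ? s > bestScore : s < bestScore;
            if (best == null || better) {
                best = candidate;
                bestScore = s;
            }
        }
        return best;
    }

    // Numeric rule from the list above, for positive integers.
    static double ratio(Integer a, Integer b) {
        return 2.0 * Math.min(a, b) / (a + b);
    }

    public static void main(String[] args) {
        Collection<Integer> values = Arrays.asList(7, 8, 10, 12);
        System.out.println(pick(values, 9, ClosestPickSketch::ratio, true));
        System.out.println(pick(values, 9, ClosestPickSketch::ratio, false));
    }
}
```

Passing the scorer in makes it easy to try a different similarity rule without touching the selection loop.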

Code:

Most similar element:

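    /**
     * Returns the element of base that is most similar to t.
     *
     * @return the most similar element, or null if base or t is null or base is empty
     */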
    public static <T> T getMaxSimilarity(Collection<T> base, T t) {
        if (base == null || t == null || base.size() == 0) {
            return null;
        }
        T max = null;
        double maxSimilarity = 0;
        for (T obj : base) {
            double v = similarity(t, obj);
            System.out.println("maxSimilarity:" + v);
            if (maxSimilarity <= v || max == null) {
                max = obj;
                maxSimilarity = v;
            }
        }
        return max;
    }

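Least similar element:

    /**
     * Returns the element of base that is least similar to t.
     *
     * @return the least similar element, or null if base or t is null or base is empty
     */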
    public static <T> T getMinSimilarity(Collection<T> base, T t) {
        if (base == null || t == null || base.size() == 0) {
            return null;
        }
        T min = null;
        double minSimilarity = 0;
        for (T obj : base) {
            double v = similarity(t, obj);
            System.out.println("minSimilarity:" + v);
            if (minSimilarity >= v || min == null) {
                min = obj;
                minSimilarity = v;
            }
        }
        return min;
    }

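Similarity computation:

    /**
     * Computes the similarity of obj to base.
     *
     * @return 1 for identical values, 0 for completely different values or different types
     */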
    public static double similarity(Object base, Object obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        if (!base.getClass().equals(obj.getClass())) {
            return 0;
        }
        if (base instanceof Boolean) {
            return booleanSimilarity((Boolean) base, ((Boolean) obj));
        }
        if (base instanceof Character) {
            // Unbox as Character; a direct (int) cast on an Object requires an Integer.
            return charSimilarity((Character) base, (Character) obj);
        }
        if (base instanceof Number) {
            return numberSimilarity((Number) base, (Number) obj);
        }
        if (base instanceof CharSequence) {
            return stringSimilarity(base.toString(), obj.toString());
        }
        if (base instanceof Collection) {
            return collectionSimilarity((Collection) base, (Collection) obj);
        }
        if (base instanceof Map) {
            return mapSimilarity((Map) base, (Map) obj);
        }
        if (base.getClass().isArray()) {
            return arraySimilarity(base, obj);
        }
        return objectSimilarity(base, obj);
    }
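One detail worth keeping in mind when dispatching on boxed values held in Object references: a direct (int) cast on an Object compiles, but at runtime it first casts to Integer, so it fails for a Character. The small sketch below (hypothetical class and method names) shows the difference.

```java
final class CharCastSketch {

    // Returns true when casting the boxed value straight to int throws.
    static boolean directIntCastFails(Object boxed) {
        try {
            int ignored = (int) boxed; // cast to Integer, then unbox
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Casting to Character first unboxes to char, which then widens to int.
    static int codeOf(Object boxed) {
        return (Character) boxed;
    }

    public static void main(String[] args) {
        System.out.println(directIntCastFails('a')); // Character: the cast throws
        System.out.println(directIntCastFails(97));  // Integer: the cast succeeds
        System.out.println(codeOf('a'));
    }
}
```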

Full code:

//package me.muphy.util;

import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.util.Collection;
import java.util.Map;

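/**
 * Object similarity algorithm.
 * getMaxSimilarity: picks the element of a collection closest to a given object
 * getMinSimilarity: picks the element of a collection furthest from a given object
 * similarity: compares the similarity of two objects
 */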
public class SimilarityAlgorithmUtils {

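    /**
     * Returns the element of base that is most similar to t.
     *
     * @return the most similar element, or null if base or t is null or base is empty
     */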
    public static <T> T getMaxSimilarity(Collection<T> base, T t) {
        if (base == null || t == null || base.size() == 0) {
            return null;
        }
        T max = null;
        double maxSimilarity = 0;
        for (T obj : base) {
            double v = similarity(t, obj);
            System.out.println("maxSimilarity:" + v);
            if (maxSimilarity <= v || max == null) {
                max = obj;
                maxSimilarity = v;
            }
        }
        return max;
    }

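    /**
     * Returns the element of base that is least similar to t.
     *
     * @return the least similar element, or null if base or t is null or base is empty
     */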
    public static <T> T getMinSimilarity(Collection<T> base, T t) {
        if (base == null || t == null || base.size() == 0) {
            return null;
        }
        T min = null;
        double minSimilarity = 0;
        for (T obj : base) {
            double v = similarity(t, obj);
            System.out.println("minSimilarity:" + v);
            if (minSimilarity >= v || min == null) {
                min = obj;
                minSimilarity = v;
            }
        }
        return min;
    }

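    /**
     * Computes the similarity of obj to base.
     *
     * @return 1 for identical values, 0 for completely different values or different types
     */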
    public static double similarity(Object base, Object obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        if (!base.getClass().equals(obj.getClass())) {
            return 0;
        }
        if (base instanceof Boolean) {
            return booleanSimilarity((Boolean) base, ((Boolean) obj));
        }
        if (base instanceof Character) {
            // Unbox as Character; a direct (int) cast on an Object requires an Integer.
            return charSimilarity((Character) base, (Character) obj);
        }
        if (base instanceof Number) {
            return numberSimilarity((Number) base, (Number) obj);
        }
        if (base instanceof CharSequence) {
            return stringSimilarity(base.toString(), obj.toString());
        }
        if (base instanceof Collection) {
            return collectionSimilarity((Collection) base, (Collection) obj);
        }
        if (base instanceof Map) {
            return mapSimilarity((Map) base, (Map) obj);
        }
        if (base.getClass().isArray()) {
            return arraySimilarity(base, obj);
        }
        return objectSimilarity(base, obj);
    }

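    /**
     * Array similarity.
     * Elements are compared with objectSimilarity (identity, equals, then public fields),
     * not with the type-dispatching similarity method.
     *
     * @param base the reference array
     * @param obj  the array compared against base
     * @return the length similarity multiplied by the match similarity
     */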
    private static double arraySimilarity(Object base, Object obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        int baseLength = Array.getLength(base);
        int objLength = Array.getLength(obj);
        double k = 0;
        for (int i = 0; i < baseLength; i++) {
            for (int j = 0; j < objLength; j++) {
                k += objectSimilarity(Array.get(base, i), Array.get(obj, j)); // compare every pair of elements
            }
        }
        return numberSimilarity(baseLength, objLength) * numberSimilarity(baseLength, k);
    }

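    /**
     * Map similarity: values stored under keys present in both maps are compared
     * with objectSimilarity.
     *
     * @param base the reference map
     * @param obj  the map compared against base
     * @return the size similarity multiplied by the match similarity
     */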
    private static double mapSimilarity(Map base, Map obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        double k = 0;
        for (Object key : base.keySet()) {
            if (obj.containsKey(key)) {
                k += objectSimilarity(base.get(key), obj.get(key));
            }
        }
        return numberSimilarity(base.size(), obj.size()) * numberSimilarity(base.size(), k);
    }

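    /**
     * Collection similarity: every element of base is compared with every element of obj.
     *
     * @param base the reference collection
     * @param obj  the collection compared against base
     * @return the size similarity multiplied by the match similarity
     */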
    private static double collectionSimilarity(Collection base, Collection obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
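        // k is an int, so each += truncates: scores below 1 are dropped and only exact matches are counted.
        int k = 0;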
        for (Object o1 : base) {
            for (Object o2 : obj) {
                k += objectSimilarity(o1, o2);
            }
        }
        return numberSimilarity(base.size(), obj.size()) * numberSimilarity(base.size(), k);
    }

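    /**
     * Object similarity: identical or equal objects score 1; otherwise the result is
     * the average similarity over the public fields returned by Class.getFields().
     *
     * @param base the reference object
     * @param obj  the object compared against base
     * @return the average field similarity, or 0 if there are no public fields
     */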
    private static double objectSimilarity(Object base, Object obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        if (base.equals(obj)) {
            return 1;
        }
        Field[] fields = base.getClass().getFields();
        double sum = 0, k = 0;
        for (Field field : fields) {
            Object o1 = null, o2 = null;
            try {
                o1 = field.get(base);
                o2 = field.get(obj);
            } catch (IllegalAccessException e) {
                e.printStackTrace();
            }
            k++;
            sum += similarity(o1, o2);
        }
        if (k == 0) {
            return 0;
        }
        return sum / k;
    }

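    /**
     * Boolean similarity: 1 if the values are equal, otherwise 0.
     *
     * @param base the reference value
     * @param obj  the value compared against base
     * @return 1 or 0
     */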
    private static double booleanSimilarity(Boolean base, Boolean obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        return base.booleanValue() == obj.booleanValue() ? 1 : 0;
    }

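    /**
     * Character similarity: compares the char codes with the numeric rule.
     *
     * @param base the reference character
     * @param obj  the character compared against base
     * @return the numeric similarity of the two char codes
     */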
    private static double charSimilarity(Character base, Character obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        return numberSimilarity((int) base.charValue(), (int) obj.charValue());
    }

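    /**
     * Numeric similarity: 2 * min(base, obj) / (base + obj).
     * The 0 to 1 range holds for non-negative inputs.
     *
     * @param base the reference number
     * @param obj  the number compared against base
     * @return the similarity score
     */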
    private static double numberSimilarity(Number base, Number obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        double b = base.doubleValue();
        double o = obj.doubleValue();
        double sum = b + o;
        if (sum < 0.00001) {
            if (b < 0.00001) {
                return 1;
            }
        }
        if (o >= b) {
            return b * 2 / sum;
        }
        return o * 2 / sum;
    }

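    /**
     * String similarity: the length similarity multiplied by the similarity between
     * the length of base and the number of matching character pairs.
     *
     * @param base the reference string
     * @param obj  the string compared against base
     * @return the similarity score
     */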
    private static double stringSimilarity(String base, String obj) {
        if (base == obj) {
            return 1;
        }
        if (base == null || obj == null) {
            return 0;
        }
        int k = 0;
        for (int i = 0; i < base.length(); i++) {
            for (int j = 0; j < obj.length(); j++) {
                if (base.charAt(i) == obj.charAt(j)) {
                    k++;
                }
            }
        }
        if (base.length() != obj.length()) {
            return numberSimilarity(base.length(), obj.length()) * numberSimilarity(base.length(), k);
        }
        return numberSimilarity(base.length(), k);
    }
}
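The numeric rule is defined for non-negative values. In the listing, inputs such as 1 and -1 sum to zero, so the final division yields an infinite value rather than a score between 0 and 1. If signed values can occur, one possible variant (an assumption for illustration, not part of the original code) is 1 - |a - b| / (|a| + |b|), which equals 2 * min(a, b) / (a + b) whenever both values are non-negative.

```java
final class SignedNumberSimilaritySketch {

    // Hypothetical signed-safe variant of the numeric rule.
    // For a, b >= 0 it reduces to 2 * min(a, b) / (a + b).
    static double similarity(double a, double b) {
        double scale = Math.abs(a) + Math.abs(b);
        if (scale == 0) {
            return 1.0; // both values are zero
        }
        return 1.0 - Math.abs(a - b) / scale;
    }

    public static void main(String[] args) {
        System.out.println(similarity(9, 10));  // same as the original rule: 18/19
        System.out.println(similarity(1, -1));  // opposite signs score 0
        System.out.println(similarity(-5, -5)); // equal negatives score 1
    }
}
```

With this form, two values of opposite sign always score 0, which may or may not suit a given data set.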

Test

Test code:

//package me.muphy.temp;

import com.alibaba.fastjson.JSON;
//import me.muphy.util.SimilarityAlgorithmUtils; // needed only when the package declarations are uncommented

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class Test {
    public Test() {

    }

    public static void main(String[] args) {
        Test.arrayTest();
        Test.listTest();
        Test.mapTest();
        Test.intSimilarityTest();
        Test.objectSimilarityTest();
    }

    public static void intSimilarityTest() {
        List<Integer> ints1 = new ArrayList<Integer>() {{
            add(7);
            add(8);
            add(10);
            add(12);
        }};
        int i = 9;
        Integer minSimilarity = SimilarityAlgorithmUtils.getMinSimilarity(ints1, i);
        System.out.println("list:" + JSON.toJSONString(ints1) + " vs " + i + ", least similar: " + minSimilarity);
        Integer maxSimilarity = SimilarityAlgorithmUtils.getMaxSimilarity(ints1, i);
        System.out.println("list:" + JSON.toJSONString(ints1) + " vs " + i + ", most similar: " + maxSimilarity);
    }

    public static void objectSimilarityTest() {
        List<A> ints1 = new ArrayList<A>() {{
            add(new A("oyerwhflkf,dasf"));
            add(new A("hjkloipsggsdfd"));
            add(new A("ppiffalshfklhf"));
            add(new A("asdfasljlsadhkn"));
        }};
        A a = new A("pfasfklh");
        A minSimilarity = SimilarityAlgorithmUtils.getMinSimilarity(ints1, a);
        System.out.println("list:" + JSON.toJSONString(ints1) + " vs " + a.toString() + ", least similar: " + minSimilarity);
        A maxSimilarity = SimilarityAlgorithmUtils.getMaxSimilarity(ints1, a);
        System.out.println("list:" + JSON.toJSONString(ints1) + " vs " + a.toString() + ", most similar: " + maxSimilarity);
    }

    public static void arrayTest() {
        int[] ints1 = {5, 6, 7, 8};
        int[] ints2 = {7, 8, 9, 1};
        double similarity = SimilarityAlgorithmUtils.similarity(ints1, ints2);
        System.out.println("array:" + JSON.toJSONString(ints1) + " vs " + JSON.toJSONString(ints2) + ", similarity: " + similarity);
    }

    public static void mapTest() {
        Map<String, A> map1 = new HashMap<>();
        Map<String, A> map2 = new HashMap<>();
        A a1 = new A("a1");
        A a2 = new A("a2");
        A a3 = new A("a3");
        A a4 = new A("a4");
        A a5 = new A("a5");
        A a6 = new A("a6");
        A a7 = new A("a7");
        map1.put("a1", a1);
        map1.put("a2", a2);
        map1.put("a3", a3);
        map2.put("a3", a3);
        map1.put("a4", a4);
        map2.put("a4", a4);
        map1.put("a5", a5);
        map2.put("a5", a5);
        map2.put("a6", a6);
        map1.put("a6", a6);
        map2.put("a7", a7);
        double similarity = SimilarityAlgorithmUtils.similarity(map1, map2);
        System.out.println("map:" + JSON.toJSONString(map1) + " vs " + JSON.toJSONString(map2) + ", similarity: " + similarity);
    }

    public static void listTest() {
        List<A> list1 = new ArrayList<>();
        List<A> list2 = new ArrayList<>();
        A a1 = new A("a1");
        A a2 = new A("a2");
        A a3 = new A("a3");
        list1.add(a1);
        list1.add(a2);
        list1.add(a3);
        list1.add(new A("a4"));
        list1.add(new A("a3"));
        list2.add(new A("a3"));
        list2.add(a2);
        list2.add(new A("a4"));
        list2.add(new A("a5"));
        double similarity = SimilarityAlgorithmUtils.similarity(list1, list2);
        System.out.println("list:" + JSON.toJSONString(list1) + " vs " + JSON.toJSONString(list2) + ", similarity: " + similarity);
    }

    static class A {
        public String name;
        public int id = 5;

        public A(String name) {
            this.name = name;
        }

        @Override
        public String toString() {
            return "A{" +
                    "name='" + name + '\'' +
                    ", id=" + id +
                    '}';
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (o == null || getClass() != o.getClass()) return false;
            A a = (A) o;
            return Objects.equals(id, a.id) && Objects.equals(name, a.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name) * id;
        }
    }
}
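The test class A declares name and id as public fields. That matters for the field-by-field comparison, because Class.getFields() returns only public fields, while private fields are visible through Class.getDeclaredFields(). The sketch below, with hypothetical class names, shows the difference in what reflection reports.

```java
final class PublicFieldsSketch {

    static class PublicPoint {
        public int x;
        public int y;
    }

    static class PrivatePoint {
        private int x;
        private int y;
    }

    // Number of public fields, including inherited ones.
    static int publicFieldCount(Class<?> type) {
        return type.getFields().length;
    }

    // Number of fields declared directly in the class, whatever their visibility.
    static int declaredFieldCount(Class<?> type) {
        return type.getDeclaredFields().length;
    }

    public static void main(String[] args) {
        System.out.println(publicFieldCount(PublicPoint.class));
        System.out.println(publicFieldCount(PrivatePoint.class));
        System.out.println(declaredFieldCount(PrivatePoint.class));
    }
}
```

For a class whose state is entirely private, a comparison based on getFields() therefore has nothing to average over.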

Test results:

array:[5,6,7,8] vs [7,8,9,1], similarity: 0.4
list:[{"id":5,"name":"a1"},{"id":5,"name":"a2"},{"id":5,"name":"a3"},{"id":5,"name":"a4"},{"id":5,"name":"a3"}] vs [{"id":5,"name":"a3"},{"id":5,"name":"a2"},{"id":5,"name":"a4"},{"id":5,"name":"a5"}], similarity: 0.7901234567901234
map:{"a1":{"id":5,"name":"a1"},"a2":{"id":5,"name":"a2"},"a3":{"id":5,"name":"a3"},"a4":{"id":5,"name":"a4"},"a5":{"id":5,"name":"a5"},"a6":{"id":5,"name":"a6"}} vs {"a3":{"id":5,"name":"a3"},"a4":{"id":5,"name":"a4"},"a5":{"id":5,"name":"a5"},"a6":{"id":5,"name":"a6"},"a7":{"id":5,"name":"a7"}}, similarity: 0.7272727272727273
list:[7,8,10,12] vs 9, least similar: 12
list:[7,8,10,12] vs 9, most similar: 10
list:[{"id":5,"name":"oyerwhflkf,dasf"},{"id":5,"name":"hjkloipsggsdfd"},{"id":5,"name":"ppiffalshfklhf"},{"id":5,"name":"asdfasljlsadhkn"}] vs A{name='pfasfklh', id=5}, least similar: A{name='ppiffalshfklhf', id=5}
list:[{"id":5,"name":"oyerwhflkf,dasf"},{"id":5,"name":"hjkloipsggsdfd"},{"id":5,"name":"ppiffalshfklhf"},{"id":5,"name":"asdfasljlsadhkn"}] vs A{name='pfasfklh', id=5}, most similar: A{name='hjkloipsggsdfd', id=5}
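The list and map scores can be reproduced by hand from the numeric rule. The sketch below (hypothetical class name) assumes the counts read off the test data: the lists have sizes 5 and 4 with 4 pairs counted as exact matches, and the maps have sizes 6 and 5 with 4 shared keys holding equal values.

```java
final class ResultCheckSketch {

    // Numeric rule for positive inputs: 2 * min(p1, p2) / (p1 + p2).
    static double fun(double p1, double p2) {
        return 2 * Math.min(p1, p2) / (p1 + p2);
    }

    // List test: size similarity fun(5, 4) times match similarity fun(5, 4).
    static double listScore() {
        return fun(5, 4) * fun(5, 4);
    }

    // Map test: size similarity fun(6, 5) times match similarity fun(6, 4).
    static double mapScore() {
        return fun(6, 5) * fun(6, 4);
    }

    public static void main(String[] args) {
        System.out.println(listScore()); // (8/9) * (8/9)
        System.out.println(mapScore());  // (10/11) * (8/10)
    }
}
```

The same rule explains the integer test: against 9, the candidates 7, 8, 10 and 12 score 14/16, 16/17, 18/19 and 18/21, so 10 is the closest and 12 the furthest.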

That's all for now; I may not even know what I'm writing myself~~

Source: https://www.cnblogs.com/muphy/p/14254619.html
