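Playing with Microsoft Cognitive Services

Microsoft (MS) Build 2016 has just wrapped up. After watching it and topping up my faith, I noticed that besides HoloLens and Windows Shell there was also something called Cognitive Services. Put simply, Microsoft has opened part of its image recognition APIs for developers to play with. In the live stream the presenter took a photo on stage, and after the MS server analyzed it, it recognized things such as "desk". It looked like fun and I had free time at home, so I tried this family of APIs. Besides the Vision series there are also Speech, Language, Knowledge, and Search APIs. With limited time, I only played with image recognition: getting back basic information about the image and the machine-generated description.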

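Official resources

Official site: cognitive-services
The documentation there is thorough, and there is also a test console, for example the Computer Vision API and its API Console.
These docs are all you need to build an Android demo. You do also need a developer identity from Microsoft, i.e. a subscription key, which has to be sent in a header with every request, so don't forget to apply for one (it seems to be free for a month?).
MS has also published code on GitHub. Some of the APIs come with an Android sample for reference, which wraps the API access for convenience. Here I rewrote it with Retrofit, as practice to consolidate what I had just learned about Retrofit.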

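Development workflow

The image recognition API accepts the image to analyze in two ways: by URL or by file. For a file, the flow is Gallery -> ContentResolver -> resolve the image's real path, then a multipart upload. I tried it: the upload was rather slow, and when sending multipart through Retrofit the endpoint kept returning 415 (Unsupported Media Type), so I gave up and switched to the URL approach.
The HTTP request looks like this: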

POST https://api.projectoxford.ai/vision/v1.0/analyze?visualFeatures=Description,Tags,Categories,Faces,ImageType,Color,Adult HTTP/1.1
Content-Type: application/json
Host: api.projectoxford.ai
Content-Length: 74
Ocp-Apim-Subscription-Key: ••••••••••••••••••••••••••••••••

{"url":"http://ww4.sinaimg.cn/large/7a8aed7bgw1exixcxfj12j20in0rsgp0.jpg"}

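visualFeatures is optional; see the official docs. Ocp-Apim-Subscription-Key is the developer's subscription key. The request body is the image URL in JSON format. The image looks like this:
[Image: the picture to be analyzed]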
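
To try the request above without Retrofit, here is a minimal sketch that uses only java.net.HttpURLConnection. The class and method names (VisionRequestSketch, buildAnalyzeUrl, buildBody, analyze) are made up for illustration; the endpoint, the two headers, and the body shape come from the request shown above. It assumes the image URL contains no characters that need JSON escaping.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original demo.
class VisionRequestSketch {
    static final String ENDPOINT = "https://api.projectoxford.ai/vision/v1.0/analyze";

    // visualFeatures is a comma-separated list such as "Description,Tags".
    static String buildAnalyzeUrl(String visualFeatures) {
        return ENDPOINT + "?visualFeatures=" + visualFeatures;
    }

    // Wraps the image URL in the JSON body the endpoint expects.
    // Assumption: imageUrl needs no JSON escaping.
    static String buildBody(String imageUrl) {
        return "{\"url\":\"" + imageUrl + "\"}";
    }

    // Sends the POST and returns the raw JSON response as a string.
    // getInputStream() throws IOException on 4xx/5xx responses.
    static String analyze(String key, String visualFeatures, String imageUrl) throws IOException {
        byte[] body = buildBody(imageUrl).getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn =
                (HttpURLConnection) new URL(buildAnalyzeUrl(visualFeatures)).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setRequestProperty("Ocp-Apim-Subscription-Key", key);
        conn.setFixedLengthStreamingMode(body.length);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

On Android this has to run off the main thread, which is one reason the demo goes through Retrofit's asynchronous Call instead.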
Here is the response, see for yourself:

{
  "categories": [
    {
      "name": "others_",
      "score": 0.1015625
    },
    {
      "name": "outdoor_",
      "score": 0.00390625
    }
  ],
  "adult": {
    "isAdultContent": false,
    "isRacyContent": false,
    "adultScore": 0.11355701088905335,
    "racyScore": 0.13894116878509522
  },
  "tags": [
    {
      "name": "person",
      "confidence": 0.999338686466217
    },
    {
      "name": "indoor",
      "confidence": 0.97037023305892944
    },
    {
      "name": "window",
      "confidence": 0.84870690107345581
    },
    {
      "name": "staring",
      "confidence": 0.48543870449066162
    }
  ],
  "description": {
    "tags": [
      "person",
      "indoor",
      "window",
      "looking",
      "front",
      "standing",
      "man",
      "woman",
      "staring",
      "laptop",
      "holding",
      "table",
      "black",
      "sitting",
      "shirt",
      "computer",
      "mirror",
      "room",
      "young",
      "glasses",
      "glass",
      "wearing",
      "screen",
      "white",
      "phone",
      "umbrella"
    ],
    "captions": [
      {
        "text": "person standing in front of a window",
        "confidence": 0.31947672828769375
      }
    ]
  },
  "requestId": "8f80d6ad-561c-4bd8-851e-ffc108958ac9",
  "faces": [],
  "color": {
    "dominantColorForeground": "Black",
    "dominantColorBackground": "Black",
    "dominantColors": [
      "Black",
      "White",
      "Grey"
    ],
    "accentColor": "424F59",
    "isBWImg": false
  },
  "imageType": {
    "clipArtType": 0,
    "lineDrawingType": 0
  }
}

I pulled out captions as the fun part; confidence is how sure the machine is of its guess. The result looks like this:

[Image: result]
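
The caption extraction can be sketched in plain Java. The author's VisionAnalyzeResult class is not shown in this post, so the model below (VisionAnalyzeResultSketch and its nested classes) is a hypothetical stand-in that covers only the description and requestId parts of the response. The field names mirror the JSON keys so that a converter such as Gson could map them by name. bestCaption is also an invented helper: it returns the caption with the highest confidence, or null if there is none.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for part of the analyze response; not the author's class.
class VisionAnalyzeResultSketch {

    static class Caption {
        String text;
        double confidence;

        Caption() { }                       // no-arg constructor for JSON converters

        Caption(String text, double confidence) {
            this.text = text;
            this.confidence = confidence;
        }
    }

    static class Description {
        List<String> tags = new ArrayList<>();
        List<Caption> captions = new ArrayList<>();
    }

    Description description;
    String requestId;

    // Returns the caption the service is most confident about, or null.
    static Caption bestCaption(VisionAnalyzeResultSketch result) {
        if (result == null || result.description == null
                || result.description.captions == null) {
            return null;
        }
        Caption best = null;
        for (Caption c : result.description.captions) {
            if (best == null || c.confidence > best.confidence) {
                best = c;
            }
        }
        return best;
    }
}
```

For the response above, which has a single caption, this would pick "person standing in front of a window" with a confidence of about 0.32.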

Code

This demo is mostly about API access, so here are all the interfaces. They assume you know the basics of Retrofit.
MS Vision Analyze interface:

@Headers({"Content-Type: application/json"})
@POST("vision/v1.0/analyze")
Call<VisionAnalyzeResult> analyzeImgFromUrl(@Header("Ocp-Apim-Subscription-Key") String key,
                                            @Query("visualFeatures") String visualFeatures,
                                            @Body Url url);

Where:

public static final String API_BASE_URL = "https://api.projectoxford.ai/";
public static final String MS_VISION_SUBSCRIPTION = "xxx"; // yours

There is also the image API provided by gank.io:

@GET("{random}/data/{type}/{count}")
Call<GankModelWrapper<FuliModel>> getGankData(@Path("random") String random,
                                              @Path("type") String type,
                                              @Path("count") String count);

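Summary

It feels like sipping a bit of soup in a big company's glow by playing with its APIs. Take this as a starting point; I think a far more impressive app could be built on top of it.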