Bblfsh Java driver cuts comments written in Chinese

I get a UnicodeDecodeError when bblfsh parses the following Java code extract:

public class ChineseComment {
    public final int aLot() {
        return 1000;
    }
}

The call to client.parse() succeeds, but I get an error when I try to retrieve the FunctionGroup node.
The following minimal Python client code reproduces the problem:

import bblfsh

client = bblfsh.BblfshClient("")
ctx = client.parse('')
for unode in ctx.filter("//uast:FunctionGroup"):
    funcgroup_node = unode.get()

The string “默认2倍” appears to be valid UTF-8…
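
For what it's worth, that check can be sketched in plain Python, without bblfsh. The snippet below confirms that the string round-trips through UTF-8, and shows one way a UnicodeDecodeError can arise: slicing the encoded bytes with a character count instead of a byte count. This is only an assumption about the kind of bug involved, not a diagnosis of the driver, and `decodes_cleanly` is a hypothetical helper introduced for illustration.

```python
text = "默认2倍"
encoded = text.encode("utf-8")

# The round trip succeeds, so the string itself is valid UTF-8.
assert encoded.decode("utf-8") == text


def decodes_cleanly(data: bytes) -> bool:
    """Return True if `data` is a complete, valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Each CJK character here takes 3 bytes in UTF-8 and "2" takes 1 byte,
# so the character count and the byte count differ.
print(len(text), len(encoded))

# Assumed failure mode: cutting the bytes at the character count
# splits the second character in the middle.
truncated = encoded[:len(text)]
print(decodes_cleanly(encoded), decodes_cleanly(truncated))
```

If a component mixes up character offsets and byte offsets like this, the resulting bytes can no longer be decoded even though the original string was fine.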

Thanks for reporting this issue. Can you please specify the versions of bblfshd and the Java driver?

Also, we recently discovered an issue related to UTF-8 comments; it will be fixed in the next release (ETA: next week).

Thanks for your answer.

I do not know how to check the version.
According to the digests displayed by Docker:
bblfsh/bblfshd latest-drivers sha256:899fe866f06c…
bblfsh/bblfshd latest sha256:921f4f25c102…

I think it's 2.14 for both bblfsh and bblfsh-drivers.

I will try to use the latest version in the future…


Hi @mesnardo

The latest version of bblfshd (2.16.1) has been released. Could you check whether your file now parses correctly?

# In case you have the bblfshd server running
docker stop bblfshd
# Delete old bblfshd
docker rm bblfshd
# Use latest one
docker run -d --name bblfshd --privileged -p 9432:9432 -v /var/lib/bblfshd:/var/lib/bblfshd bblfsh/bblfshd

Yes, it’s OK now.
Thank you.