Social Media Analytics (SMA)

  • Author
    Posts
    • #6802
      Nurul Jannah
      Participant

      I'm Nurul Jannah, and I have a question.

      I am trying to stream Twitter data for a few specific keywords and write it directly into an Elasticsearch database. However, after the script has run for a while, this error appears: ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read, 1 more expected)', IncompleteRead(0 bytes read, 1 more expected)). How can I fix this? Also, can the data that was collected and stored in the database before the error occurred be used for research?

      The code I am using comes from the sma_01 module (with minor modifications) and is shown below:

      import warnings; warnings.simplefilter('ignore')
      from elasticsearch import Elasticsearch as Es
      
      server, port, timeout = 'localhost', 9200, 30  # local host = 172.27.0.1
      try:
          conEs = Es( [ {'host':server,'port':port,'timeout':timeout} ], verify_certs=True)
          if conEs.ping():
              print('Connected to ElasticSearch, connection = "conEs"')
          else:
              raise ValueError("Error 01: cannot connect to ElasticSearch. Make sure the ES server is running and that the port and server IP are correct")
      except Exception as err:
          # Print the underlying exception too, so the Error 01 message above is not swallowed
          print('Error 02: cannot connect to ElasticSearch. Make sure the ES server is running and that the port and server IP are correct')
          print(err)
      
      def loadKeys(file='twitter_API.txt'):
          # Read one API key per line; the caller unpacks them as Ck, Cs, At, As
          with open(file, 'r', encoding="utf-8", errors='replace') as f:
              return [k.strip() for k in f.readlines()]
      
      Ck, Cs, At, As = loadKeys()
      'Done'
      
      from twython import Twython
      
      try:
          twitter = Twython(Ck, Cs, At, As)
          user = twitter.verify_credentials()
          print('Welcome "%s" you are now connected to twitter server' %user['name'])
      except Exception as err:
          # Show the actual error instead of hiding it behind a bare except
          print("Connection failed, please check your API keys or connection:", err)
      
      from twython import TwythonStreamer
      
      def streamToElastic(topicS, lang):
          # NOTE: lang is accepted but never used; tweets are not filtered by language.
          class MyStreamer(TwythonStreamer):
              def on_success(self, data):
                  global count
                  # Skip non-tweet stream messages (e.g. limit or delete notices)
                  if 'text' not in data:
                      return
                  D = {"created_at":data['created_at'], 
                       "username":data['user']['screen_name'], 
                       "tweet":data['text'],
                       "id":data['id'], 
                       "hashtags":data['entities']['hashtags'], 
                       "user_mentions":data['entities']['user_mentions'], 
                       "fav_count":data['user']['favourites_count'], 
                       "statuses_count": data['user']['statuses_count'], 
                       "followers_count":data['user']['followers_count'],
                       "friends_count": data['user']['friends_count'],
                       "favourites_count": data['user']['favourites_count'],
                       "verified": data['user']['verified'],
                       "retweet_count": data['retweet_count'], 
                       "favorite_count": data['favorite_count']}

                  conEs.index(index="tokped_tweet", body=D)
                  count += 1  # count only tweets that were actually indexed
                  if count >= maxTweet:
                      print('\nFinished streaming %.0f tweets' % maxTweet)
                      self.disconnect()

              def on_error(self, status_code, data):
                  print('Error Status = %s' % status_code)
                  self.disconnect()

          # Open a new stream until maxTweet tweets have been indexed
          while count<maxTweet:
              stream = MyStreamer(Ck, Cs, At, As)
              stream.statuses.filter(track=topicS)
      
      maxTweet, count = 5000, 0
      lang = set(['id','en'])
      topicS = ['XXXX', 'XXXXXX', '@XXXXXX', '#XXXXXX'] 
      
      streamToElastic(topicS, lang)

      I would appreciate any solution. Thank you.

    • #6814
      Taufik Sutanto
      Keymaster

      The error does not appear to come from Elasticsearch but from the streamer setup: the request cannot release all of its connection resources when the response is closed. Please try the solution here: https://stackoverflow.com/questions/26638329/incompleteread-error-when-retrieving-twitter-data-using-python
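
      One common way to handle a dropped streaming connection like this is to catch the exception and reconnect after a short pause. The sketch below is only an illustration of that idea, not necessarily what the linked answer recommends. The helper stream_with_reconnect and its parameters (start_stream, is_done, retry_on, max_retries, base_delay, sleep) are hypothetical names introduced here; they are not part of Twython or Requests.

```python
import time


def stream_with_reconnect(start_stream, is_done, retry_on,
                          max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call start_stream() until is_done() is true, reconnecting after errors.

    Hypothetical helper (an assumption for illustration):
      start_stream -- zero-argument callable that opens the stream and blocks
      is_done      -- zero-argument callable, True once enough data is collected
      retry_on     -- tuple of exception classes that should trigger a reconnect
    """
    failures = 0  # consecutive failed attempts
    while not is_done():
        try:
            start_stream()
            failures = 0  # the stream returned normally, so reset the counter
        except retry_on as err:
            failures += 1
            if failures > max_retries:
                raise  # give up after too many consecutive failures
            delay = base_delay * 2 ** (failures - 1)  # exponential backoff
            print('Stream dropped (%s); reconnecting in %.1f s' % (err, delay))
            sleep(delay)
```

      With the code from the question, this could replace the while loop inside streamToElastic, assuming the requests package is importable (Twython depends on it):

          from requests.exceptions import ChunkedEncodingError
          stream_with_reconnect(
              start_stream=lambda: MyStreamer(Ck, Cs, At, As).statuses.filter(track=topicS),
              is_done=lambda: count >= maxTweet,
              retry_on=(ChunkedEncodingError,))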
