Tom Flynn 19

Word Meaning and Word2Vec Trailhead: Construct_Examples method

I feel like I must be missing something with the code provided for this method. Near the beginning of the method, the instructions tell you to create a window of size 'n'.

Later on, you pop the center word out of the window, so I believe the length of the window would now be n-1.

At the end of the CBOW section there is the following if statement:
if len(window)<n:
   continue

It seems to me that len(window) is always n-1 at this point. As a result, you continue and never add anything to the examples list, so you end up with zero CBOW samples.

What am I missing???

Tom
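
The behavior described in the question can be reproduced in isolation. The following is a minimal sketch, assuming the window is a plain Python list; the token ids and variable names are made up for illustration and are not taken from the Trailhead notebook. It compares popping the center word out of the window with building the context by slicing, which leaves the window at its original length.

```python
# Hypothetical toy data; the token ids and n are illustrative only.
n = 5
sentence = [10, 11, 12, 13, 14, 15, 16]
window = sentence[1:1 + n]          # slicing returns a new list of n items

# Pattern A: pop the center word out of the window itself.
popped = list(window)
center_a = popped.pop(n // 2)       # the list now holds n - 1 items
skips_a = len(popped) < n           # True, so a `continue` would always fire

# Pattern B: build the context by slicing and leave the window intact.
center_b = window[n // 2]
context_b = window[:n // 2] + window[n // 2 + 1:]
skips_b = len(window) < n           # False for a full window

print(center_a, len(popped), skips_a)
print(center_b, context_b, len(window), skips_b)
```

With pattern B, a check such as len(window) < n only rejects windows that were truncated at the end of a sentence.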

 
Deepali Kulshrestha
Hi Tom,

Greetings to you!

Use this code:

import random

def construct_examples(numericalized_sentences, vocabulary, num_examples=int(1e6), n=5, sg=True, k=0):
  examples = []
  while True:
    # Select a random sentence. random.randint is inclusive on both ends,
    # so subtract 1 to avoid indexing past the end.
    sentence_idx = random.randint(0, len(numericalized_sentences)-1)
    sentence = numericalized_sentences[sentence_idx]
    if len(sentence) == 0: # randint(0, -1) would raise ValueError
      continue
    # Select a random window start and slice a window of up to n words.
    # The slice is shorter than n when it runs past the end of the sentence.
    window_idx = random.randint(0, len(sentence)-1)
    window = sentence[window_idx:window_idx+n]

    # Skip windows too short to contain a word at the center position n//2.
    if len(window) <= n//2:
      continue

    # Get the center word by position and build the context by slicing, so
    # the window itself is left unchanged (assumes each sentence is a list).
    center_word = window[n//2]
    context_words = window[:n//2] + window[n//2+1:]

    # Create the example for the selected model type.
    if sg: # if Skip-Gram
      context_word = context_words[random.randint(0, len(context_words)-1)]
      example = [center_word, context_word]
    else: # if CBOW
      # CBOW needs a full window; len(window) is still n here when it is full.
      if len(window) < n:
        continue
      example = [context_words, center_word]

    if k > 0: # if doing negative sampling
      samples = [random.randint(0, len(vocabulary.index_to_word)-1)
                 for _ in range(k)]
      example.append(samples)

    examples.append(example)
    if len(examples) >= num_examples:
      break

  return examples

I hope you find the above solution helpful. If you do, please mark it as Best Answer to help others too.

Thanks and Regards,
Deepali Kulshrestha
Tom Flynn 19
Thanks Deepali. I probably will not get back to this until late this week but I appreciate the help.

Sincerely,

Tom
Atif Razzaq
Hello Deepali

I am stuck on this trail as well. For Question 9 I am getting 39, which isn't one of the listed answers. The following is my code snippet:
 
    sentence_idx = random.randint(0,len(numericalized_sentences)-1)
    sentence = numericalized_sentences[sentence_idx]
    # TODO: Select a random window index using random.randint
    # and obtain that window of size n. Be careful to avoid indexing errors.
    window_idx = random.randint(0,len(sentence)-1)
    window = sentence[window_idx:window_idx+n]
    if len(window) <= (n//2):
      continue
      
    # TODO: Get the center word and the context words 
    center_word = sentence[window_idx+(n//2)]
    context_words = sentence[window_idx:window_idx+(n//2)] + sentence[window_idx+(n//2)+1:window_idx+n]

I am struggling with the "Word2VecModel(nn.Module)" class definition and "def train(...)". Could someone kindly share the code? The hints and instructions given in the code are not clear and aren't helpful.

Thanks in anticipation!
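
For the window-slicing step in the snippet above, it can help to see what happens when the window starts near the end of a sentence. This is a minimal sketch, assuming sentences are Python lists of token ids; the helper name split_window and the toy sentence are hypothetical and are not part of the Trailhead notebook.

```python
def split_window(sentence, window_idx, n):
    """Return (center_word, context_words, is_full), or None if the window is too short."""
    # Slicing past the end of a list does not raise; it returns a shorter list.
    window = sentence[window_idx:window_idx + n]
    if len(window) <= n // 2:
        return None  # there is no element at position n // 2
    center_word = window[n // 2]
    context_words = window[:n // 2] + window[n // 2 + 1:]
    return center_word, context_words, len(window) == n

# Hypothetical toy sentence of token ids.
sentence = [10, 11, 12, 13, 14, 15, 16]
full = split_window(sentence, 0, 5)       # full window of 5 words
truncated = split_window(sentence, 4, 5)  # only 3 words remain
too_short = split_window(sentence, 5, 5)  # only 2 words remain
print(full, truncated, too_short)
```

In the truncated case the word at position n // 2 is the last word of the window, so all of its context lies on the left. Whether such windows should be kept depends on the exercise; the is_full flag makes it easy to filter them out where a full window is required.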