Automatically extracting values from multiple word document/Marking without the admin

Mon 28 April 2014 by Eoin Travers

During a PhD, marking coursework is either a necessary evil, a welcome distraction, or an invaluable income stream to support our extravagant lifestyles. Less welcome, though, is the necessary administration that comes with it, including, in our department, entering the raw mark, feedback, and comments on each essay/report into a feedback sheet, and later, figuring out our average marks, mark distributions, examples of highest, lowest, and average submissions, and so on.

Rather than do this manually, I found a script, posted by StackOverflow user stobin, which I adapted for use with our internal forms, although it could easily be made to work with other systems too.

I've posted the modified script below. If you have no experience with VBA, instructions are included as comments (that is, the lines starting with a ', which are ignored by the computer, but contain useful information for humans).

In our documents, the raw mark, which is what we need to extract, is in row 2, column 1, of the first table in the document. It should be relatively easy to change these numbers, or add additional lines to copy more than just that cell, for your own needs.

Sub Scrape_Marks()
' How to use
' ----------
' 1. Move all your marking sheets into a single folder
'   (you can have other files here too, as the marking sheets
'    can be identified by a common keyword, like "feedback")
' 2. Create a blank Excel document.
' 3. Open the VBA editor by pressing Alt + F11
' 4. Double-click 'ThisWorkbook' on the panel in the upper left part of the screen
'   (you might have to expand some of the other menus to find this, if they're collapsed).
' 5. Copy and paste this function (Starting at "Sub Scrape_Marks()" and ending at "End Sub") into
'   the blank page that appears.
' 6. Change the folder name below to point to the folder containing your documents.
' 7. Click anywhere on this text, and then on the green 'Play' arrow to run the function.
'    - You'll probably need to minimize all your windows, and look out for a box that pops up
'   asking if you want to open a read only copy of each document. Keep hitting Return, and this
'   will eventually go away.

FolderName = "C:\Users\40027000\Desktop\Eoin Marking\Extract" ' Change this to the folder containing your word documents
Dim keyword As String
keyword = "FEEDBACK" ' Change as needed

' Define some variables
Dim wrdDoc As Object, objFiles As Object, fso As Object, wordApp As Object
Dim sh1 As Worksheet
Dim x As Integer

Set sh1 = ThisWorkbook.Sheets(1)
Set fso = CreateObject("Scripting.FileSystemObject")
Set wordApp = CreateObject("Word.application")
Set objFiles = fso.GetFolder(FolderName).Files

' Set the column headings
sh1.Cells(1, 1).Value = "Student"
sh1.Cells(1, 2).Value = "Mark"

' Loop through the folder
x = 2 ' Start on row 2
For Each wd In objFiles
    If InStr(wd, keyword) And InStr(wd, "~") = 0 Then
        ' If a document has the keyword in it's name, do the following:
        Set wrdDoc = wordApp.Documents.Open(wd.Path, ReadOnly = True) ' Open it
        'Paste its name into column 1
        sh1.Cells(x, 1) = wd.Name
        ' Find the first table ("Tables(1)") - second row, first column (".Cell(Row:=2, Column:=1)").
        ' Paste this value into column 2
        ' Change these numbers as appropriate for your document (it might take some trial and error
        ' for complicated documents)
        sh1.Cells(x, 2) = Application.WorksheetFunction.Clean(wrdDoc.Tables(1).Cell(Row:=2, Column:=1).Range)

        x = x + 1 ' Move to the next row
        wrdDoc.Close ' Close the document
    End If
Next wd ' Move on to the next file


End Sub