(To probe possible grey areas for further clarity:) What about a test that's created predominantly by humans, but where AI was used to generate the incorrect multiple-choice answers; the test is then manually printed out, distributed, collected, and scanned; and the AI then automatically grades all of the multiple-choice and short-answer questions, while humans grade the rest?
Very pedantic, but I'd count it as YES if the AI generated the questions just by referencing a database of human-crafted questions, but not if a human manually crafts all the questions for an individual test. Also, outside of grading multiple-choice and short-answer questions, what would be left? For long responses, I would lean YES if the AI graded them, though the teacher could still vet the AI's grading and reasoning.