Replies Created | Forums | Alex M | Dharma Treasure Community

Alex M

August 28, 2018 at 4:12 am #3298

Alex M
Member

Hello, editing works, at least for me. The “EDIT” link has low contrast and may be hard to see on some kinds of displays; see the attachment.


August 3, 2018 at 4:50 am #3232
Alex M
Member
Blake: I agree and would like to add several things.

For each recording there can be several types of additional data. They can coexist, though some of them make others partly obsolete:
- William’s annotations for the whole recording.
- Timestamps with a short summary & tags. I think finding questions in a recording would be easy if the waveform is presented visually: the sound level of questions is usually quite low, and there are pauses around them. Culadasa often says more than the question would lead you to expect, though.
- Machine-generated transcription without any manual correction. It is especially useful if it includes timestamps with sentence- or word-level granularity. It can be used for full-text search even if the error rate is 20%. It would not be good for reading, though; the user would still have to listen to the audio.
- Manual transcription, either fully manual or based on machine transcription with manual correction. Probably the best option, especially if it has timestamps for those who’d like to listen or watch. Live speech is structured differently from written material, though, and may look a bit strange if transcribed 1:1; I don’t know if that matters. So it’s worth providing the audio along with transcriptions.
So the first stage may be converting William’s annotations to a suitable format and adding hyperlinks. Then it may be enhanced with machine transcription, manual transcription, or timestamps with annotations. The project would already be very helpful after the first stage; later additions would improve usability.
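To illustrate why an imperfect machine transcription can still be good enough for search, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Segment` structure, the `min_fraction` threshold and the sample sentences are hypothetical and do not come from any existing tool. A segment counts as a hit when enough of the query words appear in it, so a few misrecognized words do not hide it.

```python
import re
from dataclasses import dataclass

@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    text: str     # machine-generated text, possibly with recognition errors

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def search(segments, query, min_fraction=0.6):
    """Return (start, text) for segments containing enough of the query words."""
    wanted = set(tokenize(query))
    hits = []
    for seg in segments:
        found = wanted & set(tokenize(seg.text))
        if wanted and len(found) / len(wanted) >= min_fraction:
            hits.append((seg.start, seg.text))
    return hits

# Hypothetical sample data; a real transcript would come from a speech-to-text tool.
segments = [
    Segment(12.0, "so the question is about purifications in stage four"),
    Segment(95.5, "awakening can happen without a cessation event"),
]
hits = search(segments, "stage four purification")
```

Here two of the three query words match the first segment exactly ("purification" differs from "purifications"), which is still above the threshold, so the user gets a timestamp to jump to.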

Another, unrelated thing that becomes possible once all media is gathered in one place is automatic renaming and retagging of mp3s using a common scheme, and offering .zip downloads of multiple recordings at once: for example, all “teaching retreats” in one zip and “uposatha days” in another, though traffic & storage costs should be considered. (I had to write a script to download and rename most of the recordings from dharmatreasure; there are 500+ of them, and doing it by hand is slow and error-prone.)
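A rough Python sketch of the renaming and per-category zip idea, using only the standard library. The naming scheme (`date_category_title.mp3`) and the dictionary keys `date`, `category`, `title` and `file` are hypothetical choices for illustration, not an existing convention. Retagging the mp3 metadata is not covered here, since it would need a third-party library.

```python
import re
import zipfile
from collections import defaultdict
from pathlib import Path

def slug(text):
    """Turn free text into a lowercase, dash-separated file name part."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

def canonical_name(rec):
    """Build a uniform file name from a recording's metadata (hypothetical scheme)."""
    return f"{rec['date']}_{slug(rec['category'])}_{slug(rec['title'])}.mp3"

def zip_by_category(recordings, source_dir, out_dir):
    """Write one zip archive per category and return the archive paths."""
    source_dir, out_dir = Path(source_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    groups = defaultdict(list)
    for rec in recordings:
        groups[slug(rec["category"])].append(rec)
    paths = []
    for category, recs in groups.items():
        zip_path = out_dir / f"{category}.zip"
        # mp3 data is already compressed, so store files without recompressing.
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as archive:
            for rec in recs:
                archive.write(source_dir / rec["file"], arcname=canonical_name(rec))
        paths.append(zip_path)
    return paths
```

Files are renamed only inside the archive (via `arcname`), so the originals on disk stay untouched.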

I’ll go ahead and try to write a custom site and see how it goes. Currently the only thing at risk is my time, and that’s ok. There is no need to hold off or pause other efforts, since I may fail to produce a satisfactory result for several reasons. If the result turns out to be ok, then the existing annotations, transcriptions and timestamps can be imported.
August 2, 2018 at 5:56 am #3226

Alex M
Member

William W: You are absolutely right about the general direction. The details of how to do it depend on the type of wiki, how the data is structured, which external resources are used, and how to link to them with timestamps. I think it’d be useful to convert your annotations to wiki format as-is, without timestamps, and add a link to the page with the audio. Then timestamp new recordings, then go back and add timestamps where they are missing.

Ted: What’s the current status of the wiki? Can I be of any help?

July 31, 2018 at 8:23 am #3215
Alex M
Member
Ted: a wiki would work quite well; I hadn’t thought of that.

Pros:
- works right now
- requires at most general sysadmin knowledge to operate and support. No need to look for a programmer who can work on the code, add a feature or fix a problem that has popped up; that’s a very important thing.
- multi-user editing is built-in, including logs and rollbacks
- easily extended beyond a/v recordings, and there is already a need to include posts from newsgroups.
Cons: a wiki is a generic solution, so a lot of things will never be quite right or ergonomic. Here are some off the top of my head; for some there may be a solution and for some there may not, depending on the wiki engine:
- Support for tags may be absent or unsuitable for the task, so cross-referencing has to be done manually (e.g. a page with everything related to stage 4 could be autogenerated by custom code, but may have to be maintained by hand in a wiki, and something will get lost).
- Referencing a specific part of a recording works for YouTube links (the start time can be embedded in the link). It’s not obvious how to do that for SoundCloud or raw mp3 files; I haven’t looked deeply into this. Requiring users to seek manually to the specified time is not very convenient, though it won’t stop people who really need the information.
- SoundCloud may go under (it nearly did about 2 years ago), so if it’s used for audio links, they can break.
- Displaying just one a/v player per page. Embedding a frame with a YouTube video for each fragment may be too heavy.
- Generally, ergonomics will suffer for both the people doing annotations and end users.
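On linking to a specific point in a recording, here is a small Python sketch of two options. The first adds the `t` start-time parameter to a YouTube watch link. The second appends a temporal offset in the W3C Media Fragments style (`#t=95`) to a direct media URL; whether a particular host or embedded player honors that fragment is an assumption that would need to be checked. The function names and example URLs are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def youtube_link(url, seconds):
    """Add or replace the 't' query parameter (start offset in seconds)."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "t"]
    query.append(("t", f"{int(seconds)}s"))
    return urlunsplit(parts._replace(query=urlencode(query)))

def media_fragment_link(url, seconds):
    """Append a Media Fragments style temporal offset, e.g. '#t=95'."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(fragment=f"t={int(seconds)}"))

# Placeholder URLs, for illustration only.
video = youtube_link("https://www.youtube.com/watch?v=VIDEO_ID", 95)
audio = media_fragment_link("https://example.org/talk.mp3", 95)
```

With links like these, each annotated fragment can point straight at its start time instead of asking the listener to seek manually.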
William W: thank you, that’s a great amount of work, and it can be of much use to people even if published right now as-is. The problem is how to make it visible.

Blake: integrating into an existing website expands the scope and brings another set of restrictions and tradeoffs. There are several ways to do this, from full integration to proxying content or embedding a single-page app, but it’s substantially easier to implement the thing as a standalone site (or put it on a subdomain of dharmatreasure.org). Another question is how to organize things so the site continues to work if something happens to the original developer/admin. Also, if the site is integrated into dharmatreasure, it’s much less likely that the scope will ever expand to other meditation schools, e.g. Shinzen, though I don’t know if that’s a good thing or not.
July 30, 2018 at 6:57 am #3202

Alex M
Member

JavaJeff: Dealing with utterances, fixing grammar and such are more relevant to manual transcription efforts. I think somebody is working on that now, and it’s both very useful and time-consuming. I feel it’s not possible to transcribe all recordings in the near future, so this project aims to make recordings searchable in the meantime. Also, some may prefer audio / video media.

Here is what I mean, by example: in the recent Patreon Q&A Culadasa gets asked about purifications during stage 4, awakening without cessations, … . The recording is manually preprocessed: an admin marks the beginning of each question and the end of the answer, and adds keywords (tags, labels) like “stage 4” and “purifications” and a short description of what this part is about, like “Experience of stage 4 purifications from meditator’s perspective”. The fragment about cessations may have the keywords “gradual awakening”, “gradual stream entry”, “no cessation”, “awakening without cessation” and be described by “What to focus on in practice if partial awakening happened without cessation”.

Users of the site can then search for specific tags or use full-text search on descriptions, and then proceed to listen to / view the actual recordings.
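The example above can be sketched in Python roughly like this. The `Fragment` structure, the recording id and the timestamps are hypothetical; only the tags and descriptions are taken from the example. It is a sketch of the idea under those assumptions, not a design for the actual site.

```python
from dataclasses import dataclass

@dataclass
class Fragment:
    recording: str    # identifier of the recording (hypothetical)
    start: int        # beginning of the question, in seconds
    end: int          # end of the answer, in seconds
    tags: frozenset   # keywords added by the admin
    description: str  # short summary of what this part is about

def by_tag(fragments, tag):
    """Return fragments labeled with the given tag (case-insensitive)."""
    tag = tag.lower()
    return [f for f in fragments if tag in f.tags]

def by_text(fragments, query):
    """Return fragments whose description contains every word of the query."""
    words = query.lower().split()
    return [f for f in fragments if all(w in f.description.lower() for w in words)]

# Timestamps and the recording id below are made up for illustration.
fragments = [
    Fragment("patreon-qa", 120, 610, frozenset({"stage 4", "purifications"}),
             "Experience of stage 4 purifications from meditator's perspective"),
    Fragment("patreon-qa", 610, 1200,
             frozenset({"gradual awakening", "gradual stream entry",
                        "no cessation", "awakening without cessation"}),
             "What to focus on in practice if partial awakening happened without cessation"),
]
```

For example, `by_tag(fragments, "stage 4")` returns the first fragment, and `by_text(fragments, "partial awakening")` returns the second; each result carries the recording and the start time to jump to.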