[WIP] Automated CH metadata extractor

high_octane
Explorer At Heart
Posts: 278
Joined: Fri Oct 19, 2018 3:17 am
Gender: Male
Sexual Orientation: Straight
I am a: None of the above

[WIP] Automated CH metadata extractor

Post by high_octane »

Essentially, I'm planning and working on a software project to make the lives of those interested in creating a database of CH metadata a lot easier, myself included. More specifically, this topic will be about my tentative outline regarding the automation of CH metadata collection via machine learning, and subsequent foray into usability. I understand this is a very niche topic, my apologies.

In order for this project to come to fruition, it will need to be very accurate. It won't be as accurate as manually watching CH videos for millions of hours while jotting down relevant info, but it should come very close. Since I'm using machine learning (via OpenCV), I will need a large set of photos of each pornstar in order to train a model that can recognize them.

NOTE: I've never worked with machine learning in a prior project, so I still need to take some time to familiarize myself with the library.


Here's a list of metadata that will be extracted from the CHs:
  • media info (width, height, video_encoder, audio_encoder, container, fps, creation_date, duration, etc...)
  • author(s)
  • good name (in case the name of the video is insufficient for uniquely identifying the CH)
  • # of beats (for calculating difficulty; will also include timestamps for use with external device synchronization)
  • # of rounds
  • global and per round statistics
  • models/pornstars
  • what the model is doing in the scene (to determine genre stats)
  • and anything I forgot to include :-P
Once the AI has extracted the above, it will output the results in some human-readable format. It'll be either a CSV or a JSON file; I haven't decided yet.
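As a rough illustration of what one record in that output might look like, here is a minimal sketch in C that formats a single round as a one-line JSON object. The struct, its field names, and the JSON keys are all hypothetical; the post does not define a schema, and the project may well end up with a different layout (or CSV).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-round record; field names are illustrative only. */
struct ch_round {
    unsigned index;    /* 1-based round number */
    double   start_s;  /* timestamp of the first beat, in seconds */
    double   end_s;    /* timestamp of the last beat, in seconds */
    size_t   nbeats;   /* number of beats detected in the round */
};

/* Beats per second for the round; returns 0.0 when the span is empty. */
static double round_bps(const struct ch_round *r)
{
    double len = r->end_s - r->start_s;
    return len > 0.0 ? (double)r->nbeats / len : 0.0;
}

/*
 * Format one round as a single-line JSON object into buf.
 * Returns the snprintf result, i.e. the number of characters the full
 * object needs (excluding the terminating NUL).
 */
static int round_to_json(const struct ch_round *r, char *buf, size_t cap)
{
    assert(r != NULL && buf != NULL);
    return snprintf(buf, cap,
        "{\"round\":%u,\"start_s\":%.2f,\"end_s\":%.2f,"
        "\"nbeats\":%zu,\"bps\":%.2f}",
        r->index, r->start_s, r->end_s, r->nbeats, round_bps(r));
}
```

A caller would fill in one `struct ch_round` per detected round and write each formatted line to a file. If the returned length is not smaller than `cap`, the output was truncated and a larger buffer is needed.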

This will allow for the creation of a very rich and detailed database, one which will let you get figures for a particular CH round organized like so:

Code:

CH Name: Round 1: models(25% modelname0, 25% modelname1, 50% modelname2), genre(20% HJ, 50% Tease, 30% BJ), difficulty(medium 1.4 BPS) etc...
I'll create a GitHub and/or GitLab repository and link it here once I've gotten things planned out and in a usable state, so stay tuned.

Git Repository:
https://gitlab.com/high_octane/chext

I've decided to solely use GitLab. If you already have a GitHub account, you can easily sign into GitLab with it. :-)

In the future, if anyone becomes interested in collaborating, I'll greatly welcome and appreciate it. In terms of chatroom-esque software for collaboration, I cannot use Discord anonymously (and believe me, I have tried more times than I wish to recall), but I can use Riot.

Riot room for further discussion (give it a little time and it'll load):
https://riot.im/app/#/room/#chext:matrix.org

This is a huge undertaking. Will I survive? :crazy: :lol:
Last edited by high_octane on Sat Aug 31, 2019 3:15 am, edited 2 times in total.
My original Cock Hero songs can be found here:
https://high-octane-ch.bandcamp.com or https://archive.org/details/cock-hero-osts
Spoiler: show
"When I get home I'm going to let my apparatus out of its cage!" ~fragrantEmulsion

"The rhythm for that song is very complex, and I fear that if I mimic it with the beat meter, people will want to throw their shoes at me." ~high_octane
If you're wondering what my avatar is, it's my own design entitled "The Crest of Confusion".
fragrantEmulsion
Explorer At Heart
Posts: 403
Joined: Sun Aug 26, 2018 12:14 pm
Gender: Male
I am a: Switch

Re: [WIP] Automated CH metadata extractor

Post by fragrantEmulsion »

Can I have repo access?

Re: [WIP] Automated CH metadata extractor

Post by high_octane »

fragrantEmulsion wrote: Wed Aug 14, 2019 3:25 pm Can I have repo access?
Sure thing. I don't have a repository up at the moment, but I'll get around to that soon. Access to the repo will be available to everyone. :-)

I just set up a room on Riot for further discussions of this project, which I just added to the OP. This room is also accessible to anyone. I don't want to cloud up this forum too much with this, because it isn't exactly "On Video", but it is about Cock Hero.

That said, I will post major updates here. Discussion here is also fine if people don't feel like making an account for Riot.
Qiubi
Explorer
Posts: 94
Joined: Fri Nov 16, 2018 6:32 pm

Re: [WIP] Automated CH metadata extractor

Post by Qiubi »

and... for the mere mortals who don't understand these things... what are you doing exactly? A program to watch random videos?

Re: [WIP] Automated CH metadata extractor

Post by high_octane »

Qiubi wrote: Wed Aug 14, 2019 4:47 pm and... for the mere mortals who don't understand these things... what are you doing exactly? A program to watch random videos?
I'm creating a program which will analyze a Cock Hero video and output various information about it. Thanks to advances in machine learning, extracting that info is becoming easier. For instance, I can train the AI to look for things that look like the beginning of a round. When it identifies one, it will log a timestamp of when it saw it (and a timestamp of the first beat in the round, for calculating difficulty). Then, when it identifies the next round, the gap between the two timestamps gives the duration of the first round.
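The duration step described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name is hypothetical, the detector is assumed to deliver round start timestamps in ascending order, and the last round is assumed to run until the end of the video, which the post does not specify.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fill durations[i] with the length of round i in seconds.
 * starts:    detected round start timestamps, ascending, in seconds
 * n:         number of rounds
 * video_end: end of the video, taken as the end of the last round
 *            (an assumption; credits or outros would need handling)
 */
static void round_durations(const double *starts, size_t n,
                            double video_end, double *durations)
{
    for (size_t i = 0; i < n; i++) {
        /* A round ends where the next one starts, or at the video end. */
        double next = (i + 1 < n) ? starts[i + 1] : video_end;
        assert(next >= starts[i]);  /* timestamps must be ascending */
        durations[i] = next - starts[i];
    }
}
```

With starts at 5 s, 125 s and 305 s in a 500 s video, this gives round lengths of 120 s, 180 s and 195 s.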

Then, once all of the info is collected, a different project which focuses on interactive CH can take that data (especially the round timestamp data) and use it to piece together a ton of different rounds into a new experience. How the rounds are concatenated could also be based on genre info or pornstar info, which is also something that my project aims to collect, and so on and so forth...

Sorry if that wasn't any clearer. I'm not the best at explaining things. :-P
jamesredcool
Explorer
Posts: 9
Joined: Tue Nov 15, 2016 6:57 pm
Gender: Male
Sexual Orientation: Straight
I am a: Dom (Male)
Location: Europe

Re: [WIP] Automated CH metadata extractor

Post by jamesredcool »

Absolute mad lad. It sounds very cool :thumbsup:

Re: [WIP] Automated CH metadata extractor

Post by Qiubi »

Thanks, that was clearer than the other explanation xDD
Rule63MePlease
Explorer At Heart
Posts: 187
Joined: Sun Sep 20, 2015 4:13 am
I am a: None of the above

Re: [WIP] Automated CH metadata extractor

Post by Rule63MePlease »

This sounds very awesome! I have always wanted there to be an official system to rate the difficulty of a CH. Having a program that can count all the beats, beat changes/patterns, beats per minute, total number of beats in each round, and how many rounds a CH has would make that process a lot easier.
Next we will have to decide on what kind of score to give each of those numbers to get the total difficulty score.

This program could also be used for automating devices to play Cock Hero with, kind of like this thing: http://cockheromachine.blogspot.com/2017/
doremi
Experimentor
Posts: 1207
Joined: Sat Apr 23, 2016 11:09 pm
Gender: Male
Sexual Orientation: Straight

Re: [WIP] Automated CH metadata extractor

Post by doremi »

high_octane, do you realise that if you develop this software as an official project while enrolled in a Computer Science program, you could be the very first to earn a Cock Hero Doctorate degree? :w00t: :-D

As for a huge image bank to choose from, wouldn't it be nice to feed on the PornHub server drives? I guess you could write a website rip feature and use the pics and tags. :lol:
[APP] Cock Hero Slideshow Player - Thinking about a script feature for [APP] Cock Hero Video Player :icecream:
If your video is too fat, there's a solution!
Spoiler: show
The generated output of your video editor may be bloated: too big for no significant benefit. One thing you can do is use HANDBRAKE with the H.264 (x264), RF18 Constant Quality and Web Optimized / Fast Start options, leaving all other options at their defaults. You'd be surprised how much smaller the video becomes, without any noticeable impact on the quality.
:yes:

LINKS:

HandBrake, The open source video transcoder
https://handbrake.fr/

For future reference, here's the original HandBrake post by Eriol:
viewtopic.php?f=25&t=12815&hilit=Handbrake#p164242
Interesting for further details about the process.
:thumbsup:
So many projects to kill, so little time. :-)

Re: [WIP] Automated CH metadata extractor

Post by high_octane »

Rule63MePlease wrote: Wed Aug 14, 2019 11:44 pm This sounds very awesome! I have always wanted there to be an official system to rate the difficulty of a CH. Having a program that can count all the beats, beat changes/patterns, beats per minute, total number of beats in each round, and how many rounds a CH has would make that process a lot easier.
I too have yearned for that, and this program will help achieve exactly that.
Rule63MePlease wrote: Wed Aug 14, 2019 11:44 pm Next we will have to decide on what kind of score to give each of those numbers to get the total difficulty score.
I posted this in a different thread, and my method hasn't changed since then. Basically, in order to get the D-weighted difficulty value, you need to add a pref value to the base difficulty. And by "D-weighted", I mean "dick-weighted", of course. :lol:
high_octane wrote: Thu Jun 13, 2019 3:31 am This is my idea for calculating the difficulty of a round:

Code:

#include <stddef.h> /* size_t */

/**
 * nbeats: number of beats in the round
 * len:    length of the round in seconds (measured from the first beat to the last);
 *         must be greater than 0, otherwise the division is meaningless
 * prefs:  content the user finds the most arousing (scale from -2.0 to +2.0, where - values are
 *                                                   less arousing and + values are more arousing)
 *
 * returns difficulty value (-inf to 0.4 = easiest, 0.5 to 0.9 = super easy, 1.0 to 1.4 = very easy,
 *                           1.5 to 1.9 = easy, 2.0 to 2.4 = medium, 2.5 to 2.9 = hard,
 *                           3.0 to 3.4 = very hard, 3.5 to 3.9 = super hard, 4.0 to inf = hardest)
 */
static inline double get_difficulty(size_t nbeats, double len, double prefs)
{
    /* base difficulty in beats per second, shifted by the user's preference */
    return ((double)nbeats / len) + prefs;
}
The prefs implementation is a bit naive here but whatever.
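The label bands listed in the comment above could be turned into a lookup roughly like this. It is a sketch, not part of the original post: the function and array names are hypothetical, and since the comment leaves small gaps between bands (for example between 0.4 and 0.5), this version assumes each band is half-open, so 0.5 to 0.9 is read as "at least 0.5 and below 1.0".

```c
#include <assert.h>
#include <stddef.h>

/* Labels in ascending order, one per 0.5-wide band. */
static const char *const LABELS[] = {
    "easiest", "super easy", "very easy", "easy", "medium",
    "hard", "very hard", "super hard", "hardest"
};

/* Map a difficulty value to its label; the ends of the scale are open. */
static const char *difficulty_label(double d)
{
    const size_t last = sizeof LABELS / sizeof LABELS[0] - 1;

    assert(d == d);              /* reject NaN before converting */
    if (d < 0.5)
        return LABELS[0];        /* everything below 0.5 is "easiest" */
    if (d >= 4.0)
        return LABELS[last];     /* everything from 4.0 up is "hardest" */
    /* d is in [0.5, 4.0), so d / 0.5 truncates to a band index 1..7 */
    return LABELS[(size_t)(d / 0.5)];
}
```

Under that assumption, a value of 1.3 maps to "very easy" and 2.1 maps to "medium", while anything at or above 4.0 stays "hardest" no matter how large it gets.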
doremi wrote: Wed Aug 14, 2019 11:53 pm high_octane, do you realise that if you develop this software as an official project while enrolled in a Computer Science program, you could be the very first to earn a Cock Hero Doctorate degree? :w00t: :-D
I guess I'm Dr. high_octane now. :lol: But in all seriousness, I chose not to pursue tertiary education for various reasons. Most of the skills that I've learned are self-taught, because I had the will and desire to learn them. Now it is time to use my skills for the good of the Milovana community! :-D
doremi wrote: Wed Aug 14, 2019 11:53 pm As for a huge image bank to choose from, wouldn't it be nice to feed on the PornHub server drives? I guess you could write a website rip feature and use the pics and tags. :lol:
That's not a bad idea! I'll look into it. To have a very accurate model, I'll probably need around 5,000 pics of each pornstar. Obviously, there probably aren't that many images of a particular pornstar, so around 1,000 should suffice.

Re: [WIP] Automated CH metadata extractor

Post by Rule63MePlease »

I think the difficulty scale should range from 1 to 10, or 1 to 100. Going from 0.4 to 4.0 just seems like a very odd system. :-/

Re: [WIP] Automated CH metadata extractor

Post by high_octane »

Rule63MePlease wrote: Fri Aug 16, 2019 2:53 pm I think the difficulty scale should range from 1 to 10, or 1 to 100. Going from 0.4 to 4.0 just seems like a very odd system. :-/
It looks strange because it directly modifies the beats-per-second value. The scale is based on the range of practical tempos in music.

For instance, 0.4 bps equals 24 bpm, and 4.0 bps equals 240 bpm. Probably no song in a CH will ever be that slow or fast, but if someone really, really likes the content in a round (+2.0 pref) that is only 2.0 bps (120 bpm), i.e. "medium" base difficulty, then that round would be considered "hardest" difficulty for them in the D-weighted calculation (2.0 + 2.0 = 4.0).

Likewise, if someone truly despises the content with all of their being (-2.0 pref), then the "medium" base difficulty round would be considered "easiest" in D-weighted (2.0 - 2.0 = 0.0).

The scale also accounts for values above or below that range; it's only the difficulty label that is capped at those values.
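The arithmetic in the examples above is simple enough to check in code. The following C sketch uses hypothetical helper names (they are not from the post) and only restates the conversions: bpm is bps times 60, and the weighted value is the base value plus the preference.

```c
#include <assert.h>

/* Convert between beats per minute and beats per second. */
static double bpm_to_bps(double bpm) { return bpm / 60.0; }
static double bps_to_bpm(double bps) { return bps * 60.0; }

/* Base difficulty (in bps) shifted by a preference in [-2.0, +2.0]. */
static double weighted_difficulty(double bps, double pref)
{
    assert(pref >= -2.0 && pref <= 2.0);  /* range stated in the thread */
    return bps + pref;
}
```

For a 120 bpm round, `bpm_to_bps(120.0)` gives 2.0; adding a preference of +2.0 yields 4.0, and adding -2.0 yields 0.0, which are the two extremes described above.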

Another thing to note: These values are only meant to be used internally. The actual difficulty ratings are the labels, like "easy", "medium", "hard", etc... But it is also nice to display the actual numbers, too, to see exactly how a round's difficulty label was calculated.

D-weighted sounds really corny to me, so what about this as an example:

Code:

D  = very easy (1.3)
Dp = medium    (2.1)

where 'D' means difficulty or base difficulty, 'Dp' means difficulty+preferences, and 'p' equals +0.8
To make this more accurate, I need to constantly modify 'p' throughout the round, based on the content. This would allow for calculations where both pleasant and unpleasant content are featured in a round.
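One possible way to vary 'p' within a round is to split the round into content segments and take a duration-weighted average of their preference values. The post does not say how 'p' would be combined, so the averaging rule, the struct, and the function name below are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical content segment within a round. */
struct segment {
    double len;   /* segment length in seconds */
    double pref;  /* preference for this content, -2.0 to +2.0 */
};

/*
 * Duration-weighted mean of pref across the segments of one round.
 * Returns 0.0 (neutral) when the segments have no total length.
 */
static double mean_pref(const struct segment *segs, size_t n)
{
    double total = 0.0, weighted = 0.0;

    for (size_t i = 0; i < n; i++) {
        assert(segs[i].len >= 0.0);
        total += segs[i].len;
        weighted += segs[i].len * segs[i].pref;
    }
    return total > 0.0 ? weighted / total : 0.0;
}
```

A round that is half +2.0 content and half -1.0 content would get p = +0.5, and Dp would then be D plus that averaged value. Other rules (taking the peak value, or weighting later segments more heavily) are equally plausible.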
3xTripleXXX
Explorer At Heart
Posts: 664
Joined: Wed Jun 14, 2017 8:35 pm
Gender: Male
Sexual Orientation: Straight

Re: [WIP] Automated CH metadata extractor

Post by 3xTripleXXX »

Both D and DP. Sounds like a winning scoring system! :-D
My latest Cock Hero is Sweet Mammaries, a Cock Hero Quickie.

I've also made 4-play, 4-play 2, Getting Down With The Thiccness, Fuck Hard Cum Harder, Filthy Cute and Kittens & Cream. You can stream them all on SpankBang or search up their announce threads here for other options. :)

Re: [WIP] Automated CH metadata extractor

Post by Rule63MePlease »

I don't think there should be a cap. Yeah, it's very unlikely a CH will go so high, but I still think it would be better for the final score to be displayed as a number from 1-10 or 1-100. I mean, 24 could just = 1.0 or 10 and 240 would = 10.0 or 100. If some CH does go past the max, then oh well. :lol:

Also a long time ago I came up with a formula for a CH score. Not for difficulty of a CH but how far a player made it vs how many days they went without fapping. I need to go find that again.

Re: [WIP] Automated CH metadata extractor

Post by high_octane »

Rule63MePlease wrote: Sat Aug 17, 2019 5:50 pm I don't think there should be a cap. Yeah, it's very unlikely a CH will go so high, but I still think it would be better for the final score to be displayed as a number from 1-10 or 1-100. I mean, 24 could just = 1.0 or 10 and 240 would = 10.0 or 100. If some CH does go past the max, then oh well. :lol:
The cap was referring to the label system. For instance, a D score of 3.2 with a pref value of 1.5 (Dp of 4.7) would be labelled the same difficulty (hardest) as a D score of 2.0 with a pref value of 2.0 (Dp of 4.0). Even if Dp was equal to 16.5, it would still be "hardest" difficulty. D and Dp have no bounds.

I'm not sure what the advantages are of using a range from 1-10 or 1-100. Maybe I'm misunderstanding something. :hmmm: I could convert my floating-point range into your integer range, but I'm just not sure how that would improve things. Is it purely for aesthetics?
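For reference, converting between the two ranges would be a plain linear rescale. The sketch below assumes the mapping suggested earlier in the thread (24 bpm, i.e. 0.4 bps, becomes 1.0 and 240 bpm, i.e. 4.0 bps, becomes 10.0); the function name is hypothetical, and values outside the range are deliberately left unclamped.

```c
#include <assert.h>

/*
 * Linearly map a beats-per-second value onto a 1-10 scale, pinning
 * 0.4 bps (24 bpm) to 1.0 and 4.0 bps (240 bpm) to 10.0.
 * Inputs outside that range produce scores below 1 or above 10.
 */
static double to_ten_scale(double bps)
{
    const double lo = 0.4, hi = 4.0;

    assert(bps == bps);  /* reject NaN */
    return 1.0 + (bps - lo) * 9.0 / (hi - lo);
}
```

Because the mapping is linear, it carries exactly the same information as the bps value; a 2.2 bps round lands at 5.5, the midpoint of both scales. Multiplying the result by 10 gives the 10-100 variant.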
Rule63MePlease wrote: Sat Aug 17, 2019 5:50 pm Also a long time ago I came up with a formula for a CH score. Not for difficulty of a CH but how far a player made it vs how many days they went without fapping. I need to go find that again.
Perhaps your scoring system could be integrated into my difficulty calculation as well. It sounds interesting. :yes: